
Java: How do I correctly read URL content that contains UTF-8 characters?


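I have the servlet shown below, which calls URLReader.read.

When I run it, I get:
{"sentences":[{"trans":"end","orig":"koďż˝","translit":"","src_translit":""}],"src":"pl","server_time":30}
So UTF-8 is not handled correctly. But if I take the encoded URL:
http://translate.google.com/translate_a/t?client=o&text=ko%C5%84&hl=en&sl=pl&tl=en
and paste it into the browser's address bar, I get the correct result:
{"sentences":[{"trans":"horse","orig":"koń","translit":"","src_translit":""}],"dict":[{"pos":"noun","terms":["horse"]}],"src":"pl","server_time":76}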

public class AbcServlet extends HttpServlet {
    public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("text/plain;charset=UTF-8");
        resp.getWriter().println(new String(URLReader.read("pl", "en", "koń")));
    }
}
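json.getBytes("UTF-8") gives you a UTF-8 byte sequence, so URLReader.read also returns a UTF-8 byte sequence.

But you then decode it without specifying a charset, i.e.
new String(URLReader.read("pl", "en", "koń"))
so Java decodes the bytes with the platform default charset, which on your system is evidently not UTF-8.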
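The effect can be reproduced without any network access. The following is a minimal sketch, not code from the thread: the class name DecodeDemo and its methods are made up for illustration, and ISO-8859-1 merely stands in for "some default charset that is not UTF-8".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the original question or answer.
final class DecodeDemo {

    // Decodes the bytes explicitly as UTF-8, independent of the platform default.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Stand-in for a non-UTF-8 default charset: every byte becomes one char.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "ko\u0144" is "koń"; in UTF-8 it is the four bytes 6B 6F C5 84.
        byte[] utf8 = "ko\u0144".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodeUtf8(utf8).equals("ko\u0144")); // true
        System.out.println(decodeLatin1(utf8).length());         // 4
    }
}
```

Decoding with the wrong charset turns the two bytes of ń into two unrelated characters, which is the kind of garbage seen in the "orig" field above.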

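Try:

new String(URLReader.read("pl", "en", "koń"), "UTF-8")

Update

Don't forget to escape ń as \u0144. The Java compiler may not read the Unicode literal correctly (for example, when the source file encoding differs from what javac expects), so it is safer to write it in plain ASCII.

Here is the complete working code on my machine: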

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLEncoder;

public class URLReader {

    // Returns the first line of the response as UTF-8 bytes, or null on failure.
    public static byte[] read(String from, String to, String string) {
        try {
            String text = "http://translate.google.com/translate_a/t?"
                    + "client=o&text=" + URLEncoder.encode(string, "UTF-8")
                    + "&hl=en&sl=" + from + "&tl=" + to;
            URL url = new URL(text);
            URLConnection conn = url.openConnection();
            // Sending a browser-like User-Agent appears to avoid the 403 error.
            conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13 (.NET CLR 3.5.30729)");
            // Decode the response explicitly as UTF-8; try-with-resources closes the reader.
            try (BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"))) {
                String json = in.readLine();
                return json.getBytes("UTF-8");
            }
        } catch (Exception e) {
            System.out.println(e);
            // Be careful with returning null: a caller that uses the result
            // without checking it will throw a NullPointerException.
            return null;
        }
    }
}
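The note about writing ń as \u0144 and the URLEncoder.encode call in URLReader can both be checked in isolation. This is only a sketch under the assumption that a standalone class is acceptable here; EscapeDemo and encodeQueryValue are hypothetical names, not part of the answer.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical demo class; not part of the original answer.
final class EscapeDemo {

    // Percent-encodes a query value as UTF-8, as URLReader does for the text parameter.
    static String encodeQueryValue(String value) throws UnsupportedEncodingException {
        return URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String word = "ko\u0144"; // ASCII-only spelling of "koń"
        System.out.println(word.length());          // 3
        System.out.println((int) word.charAt(2));   // 324, i.e. 0x0144
        System.out.println(encodeQueryValue(word)); // ko%C5%84
    }
}
```

The encoded value matches the text=ko%C5%84 parameter in the URL from the question, so the request itself is built correctly; the damage happens only when the response bytes are decoded.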

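Comments:

Hmm, now it returns {"sentences":[{"trans":"end","orig":"ko�","translit":"","src_translit":""}],"src":"pl","server_time":20}

Is that what you get in your web browser? Don't use PrintWriter when handling encoded bytes. PrintWriter will use the JVM default encoder, which is not UTF-8. Try getOutputStream().write((new String(URLReader.read("pl", "en", "koń"), "UTF-8")).getBytes("UTF-8"))

Note that setting resp.setContentType("text/plain;charset=UTF-8"); does not really tell the servlet to encode the output as UTF-8. It only informs the target web browser/client that you are sending a byte stream encoded in UTF-8. The actual content encoding does not have to match the Content-Type header. (Of course, you don't want a mismatch.)

I don't need to write this out; I need to save the data to a database correctly, but I don't see a good way to verify it.

I tried your code, but I got a 403 error from the Google server. It doesn't allow me to use its translation service.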
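import java.io.IOException;
import javax.servlet.http.HttpServlet; // jakarta.servlet.http on newer containers
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class AbcServlet extends HttpServlet {

    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
        resp.setContentType("text/plain;charset=UTF-8");
        // URLReader.read already returns UTF-8 bytes, so write them unchanged.
        byte[] read = URLReader.read("pl", "en", "ko\u0144");
        resp.getOutputStream().write(read);
    }
}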
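The servlet above sidesteps any writer-side encoding by sending the bytes as they are. The difference between that and writing text through a PrintWriter can be sketched outside a servlet container; OutputDemo and its methods are hypothetical names, and a ByteArrayOutputStream stands in for the response stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; a ByteArrayOutputStream stands in for the servlet response.
final class OutputDemo {

    // Writes already-encoded bytes unchanged, like resp.getOutputStream().write(read).
    static byte[] writeRaw(byte[] encoded) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        sink.write(encoded);
        return sink.toByteArray();
    }

    // Writes text through a PrintWriter whose charset is pinned to UTF-8.
    static byte[] writeText(String text) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (PrintWriter out = new PrintWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8))) {
            out.print(text);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] utf8 = "ko\u0144".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(writeRaw(utf8), utf8));        // true
        System.out.println(Arrays.equals(writeText("ko\u0144"), utf8)); // true
    }
}
```

Either route produces the same bytes as long as every conversion between String and byte[] names its charset; the garbling in the question comes from a conversion that does not.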