Java URLConnection character encoding
I am trying to read a JSON string:
{
"also_known_as": [
"Сильвестр Сталлоне"
],
"birthday": "1946-07-06",
"deathday": "",
}
over HTTP.
I have the following code:
URL url = new URL("url");
HttpURLConnection connection = (HttpURLConnection) url.openConnection();
connection.setRequestProperty("Accept-Charset", "UTF-8");//connection.setRequestProperty("Accept-Charset", "ISO-8859-1");
BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String line = "";
StringWriter writer = new StringWriter();
while((line=reader.readLine())!=null){
writer.write(line);
}
reader.close();
writer.close();
connection.disconnect();
System.out.println(writer.toString());
But it prints this string in the console:
{
"also_known_as": [
"СильвеÑ?Ñ‚Ñ€ Сталлоне"
],
"birthday": "1946-07-06",
"deathday": "",
}
I have also tried:
BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream(), "UTF-8"));//BufferedReader reader = new BufferedReader(new InputStreamReader(connection.getInputStream(), "ISO-8859-1"));
But no luck.
My question is: how do I set the character encoding for a URLConnection?
Any information would be very helpful.
Regards
Using Apache IOUtils, I tried the following:
StringWriter writer = new StringWriter();
IOUtils.copy(connection.getInputStream(), writer, "UTF-8");
But it prints the same result in the Eclipse console.
Using Apache HttpClient:
DefaultHttpClient httpClient = new DefaultHttpClient();
HttpGet getRequest = new HttpGet("http://api.themoviedb.org/3/person/16483?api_key=23e89da030a0ee8b25aaed20950a0c25");
getRequest.addHeader("accept", "application/json");
HttpResponse response = httpClient.execute(getRequest);
StringWriter writer = new StringWriter();
IOUtils.copy(response.getEntity().getContent(), writer, "UTF-8");
System.out.println(writer.toString());
Same result.
Do this:
new InputStreamReader(connection.getInputStream(), Charset.forName("UTF-8"))
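i.e. specify the charset explicitly.
Just promoting my comment to an answer, since it turned out to be the cause: the console's charset is Cp1252, so the output is correct but displayed incorrectly.
You have to give the BufferedReader or StringWriter the correct Charset.
@mKorbel I tried using IOUtils and have edited my post, but it gives the same result :(
Are you sure it isn't your console's charset that is set incorrectly?
A common problem on Windows is that simple operations become very complicated; you have to search for the Charset or the Windows code page to use in something like: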
String myString = new String(reader.toByteArray(), charEncoding);
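To see why a correct string can still look garbled, here is a minimal sketch that needs no network call. The class name `MojibakeDemo` and the helper `decode` are hypothetical, and ISO-8859-1 is used only as a stand-in for Cp1252 because it is guaranteed to exist on every JVM. It encodes the name as UTF-8, decodes the same bytes once with the right charset and once with the wrong one, and prints through a `PrintStream` that is explicitly set to UTF-8.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Decode raw bytes with an explicitly chosen charset
    // instead of relying on the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String name = "Сильвестр Сталлоне";
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);

        // Same bytes, two interpretations.
        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1); // stand-in for Cp1252

        // Assumption: the terminal or IDE console is itself configured for UTF-8.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println("decoded as UTF-8 matches original: " + right.equals(name));
        out.println("decoded as ISO-8859-1 matches original: " + wrong.equals(name));
        out.println(right);
    }
}
```

If the last line still shows garbage, the string is fine and the console encoding is the thing to change.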
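@SeanOwen The Eclipse console encoding was set to Default - inherited (Cp1252). I changed it to UTF-8 and now it prints correctly. Thanks.
This is wrong. You must check the charset in the HTTP headers; you cannot assume UTF-8.
@tchrist Yes. In this case, call connection.getContentType() and parse it.
@Manish Unfortunately, many websites do not bother to specify the charset correctly. What is the standard in that case?
@Ingo The default charset for MIME content of type "text" received over HTTP is ISO-8859-1.
Today the EU fined Microsoft 500 million euros for not offering a choice of browsers. That is unfair, IMHO. The just thing would be a fine of 500 million euros per day. This damned, useless, non-standard CP1252 is simply set as the default encoding and there is no way to change it!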
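The suggestion to parse connection.getContentType() could look roughly like this. It is a sketch, not a full header parser: the class `ContentTypeCharset` and the method `charsetOf` are hypothetical names, and the fallback charset is left to the caller because the right default depends on the media type.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ContentTypeCharset {
    // Extract the charset parameter from a Content-Type header value.
    // Returns the fallback when the header is missing, has no charset
    // parameter, or names a charset this JVM does not support.
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            // Case-insensitive match on the parameter name.
            if (p.regionMatches(true, 0, "charset=", 0, 8)) {
                String name = p.substring(8).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Covers illegal and unsupported charset names.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("application/json; charset=UTF-8", StandardCharsets.ISO_8859_1));
        System.out.println(charsetOf("text/html", StandardCharsets.ISO_8859_1));
    }
}
```

With the connection from the question, the reader would then be built as new InputStreamReader(connection.getInputStream(), charsetOf(connection.getContentType(), StandardCharsets.ISO_8859_1)). For a JSON API, passing UTF-8 as the fallback is probably the more sensible choice.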