Java: response encoding when fetching HTML text over HttpURLConnection
I am trying to read the response (not just this one, but many responses from this site). Here is the code of my function:
import java.io.BufferedReader;
import java.io.DataOutputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;

// HTTP POST request
private void sendFirstPost() throws Exception {
    String url = "http://g1.botva.ru/login.php";
    URL obj = new URL(url);
    HttpURLConnection con = (HttpURLConnection) obj.openConnection();
    con.setInstanceFollowRedirects(false);

    // add request headers
    con.setRequestMethod("POST");
    con.setRequestProperty("Accept", "*/*");
    con.setRequestProperty("Accept-Encoding", "gzip, deflate");
    //con.setRequestProperty("Content-Length", "86");
    con.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
    con.setRequestProperty("User-Agent", "runscope/0.1");
    String urlParameters = "do_cmd=login&remember=1&password=avmalyutin1234&server=1&email=avmalyutin%40mail.ru";

    // Send post request
    con.setDoOutput(true);
    DataOutputStream wr = new DataOutputStream(con.getOutputStream());
    wr.writeBytes(urlParameters);
    wr.flush();
    wr.close();

    int responseCode = con.getResponseCode();
    System.out.println("\nSending 'POST' request to URL : " + url);
    System.out.println("Post parameters : " + urlParameters);
    System.out.println("Response Code : " + responseCode);
    System.out.println("Content Type : " + con.getContentType());

    // read the body, decoding it as cp1251
    BufferedReader in = new BufferedReader(
            new InputStreamReader(con.getInputStream(), "cp1251"));
    String inputLine;
    StringBuffer response = new StringBuffer();
    while ((inputLine = in.readLine()) != null) {
        response.append(inputLine);
    }
    in.close();

    // print result
    System.out.println(response.toString());
    byte[] array = response.toString().getBytes("cp1251");
    String buffff = new String(array);
    System.out.println(buffff);
}
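The Content-Type I get back is text/html;charset=cp1251. I tried decoding with cp1251 and with windows-1251, and neither gave a good result. Once I did manage to get readable HTML, but on later runs, with no change to the source code, the output was nothing but unreadable symbols. So how do I correctly get the response as HTML text?
Although the header says the encoding is Cp1251, that is not the case. The server is sending bytes that correspond to Cp1252.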
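For intuition on how one byte sequence can read as two different texts, here is a minimal sketch. The class name CharsetRoundTrip and the method redecode are illustrative and not part of the original post. It encodes a string with one single-byte charset and decodes the same bytes with another. The input is the first two words of the page title used in the check below, "Áîòâà Îíëàéí", written as Unicode escapes so the source file encoding does not matter. Read as windows-1251, the same bytes spell "Ботва Онлайн".

```java
import java.nio.charset.Charset;

public class CharsetRoundTrip {
    // Encode the text with the charset it was decoded with, then decode
    // the very same bytes with a different charset.
    static String redecode(String text, Charset decodedAs, Charset decodeAs) {
        byte[] raw = text.getBytes(decodedAs);
        return new String(raw, decodeAs);
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251");
        Charset cp1252 = Charset.forName("windows-1252");

        // First two words of the title string from the check, as Unicode escapes.
        String asCp1252 = "\u00C1\u00EE\u00F2\u00E2\u00E0 \u00CE\u00ED\u00EB\u00E0\u00E9\u00ED";

        // Same bytes, interpreted as windows-1251 instead of windows-1252.
        System.out.println(redecode(asCp1252, cp1252, cp1251));
    }
}
```

Printing the result correctly also depends on the console encoding, so comparing the raw byte values is the more reliable test.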
One way to check is to first work out which bytes you expect to receive:
String text = "Áîòâà Îíëàéí | Áèòâà çà ðåàëüíóþ êàïóñòó!";
for (byte n : text.getBytes("Cp1251")) {
    System.out.printf("%d ", n);
}
System.out.println();
for (byte n : text.getBytes("Cp1252")) {
    System.out.printf("%d ", n);
}
System.out.println();
Then look for them in the bytes you actually receive:
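// inputStream is the raw response stream, e.g. con.getInputStream()
// read() returns -1 at end of stream; testing "> 0" would stop early on a zero byte
for (int n; (n = inputStream.read()) != -1; ) {
    System.out.printf("%d ", (byte) n);
}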
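One more thing may be worth ruling out. This is an assumption about the setup, not something the answer above confirms. The request in the question sends Accept-Encoding: gzip, deflate, and HttpURLConnection hands the body over exactly as received. If the server compresses the response, the bytes will look like unreadable symbols under any charset. Below is a minimal sketch of a reader that unpacks gzip before decoding. The names ResponseBody and read are hypothetical, and it handles gzip only, not deflate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;

public class ResponseBody {
    // body:            raw response stream, e.g. con.getInputStream()
    // contentEncoding: value of the Content-Encoding response header, may be null
    // charset:         charset used to turn the (unpacked) bytes into text
    static String read(InputStream body, String contentEncoding, Charset charset)
            throws IOException {
        // Unpack only when the server says the body is gzip-compressed.
        InputStream source = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(body)
                : body;
        try (InputStream in = source) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n; (n = in.read(chunk)) != -1; ) {
                buffer.write(chunk, 0, n);
            }
            // Decode once, after all bytes have been collected.
            return new String(buffer.toByteArray(), charset);
        }
    }
}
```

With the connection from the question, a call could look like ResponseBody.read(con.getInputStream(), con.getContentEncoding(), Charset.forName("windows-1251")). Dropping the Accept-Encoding request header altogether is a simpler way to test whether compression is involved.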