Java: Server returned HTTP response code: 406 for URL
I am writing a web crawler in Java using HttpURLConnection, and this is the error I get:
java.io.IOException: Server returned HTTP response code: 406 for URL: https://www.mkyong.com/kotlin/kotlin-how-to-loop-a-map/
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at testing.HttpURLConnectionGo.sendGet(HttpURLConnectionGo.java:34)
at testing.DefinitelyNotSpiderLeg.crawl(DefinitelyNotSpiderLeg.java:55)
at testing.DefinitelyNotSpider.search(DefinitelyNotSpider.java:33)
at testing.Test.main(Test.java:9)
This is the method I use to connect:
// HTTP GET request
public String sendGet(String url) throws Exception {
    URL obj = new URL(url);
    HttpURLConnection con = (HttpURLConnection) obj.openConnection();

    // optional, default is GET
    con.setRequestMethod("GET");

    // add request header
    con.setRequestProperty("User-Agent", USER_AGENT);

    BufferedReader in = new BufferedReader(
            new InputStreamReader(con.getInputStream()));
    String inputLine;
    StringBuffer response = new StringBuffer();

    while ((inputLine = in.readLine()) != null) {
        response.append(inputLine);
    }
    in.close();

    return response.toString();
}
Then, in another class, I hand the returned string to Jsoup:
String html = http.sendGet(url);
Document doc = Jsoup.parse(html);
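Why am I getting this error?

According to the references on HTTP status codes, HTTP 406 means "Not Acceptable". The Mozilla reference goes on to say that this error is very rarely used in practice, and that it generally
"indicates that a response matching the list of acceptable values
defined in Accept-Charset and Accept-Language cannot be served."
Your request sends only a User-Agent header and none of these, so you could try setting them in your request.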
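As a rough sketch of that suggestion, the question's sendGet could set the negotiation headers before reading the response. The class name AcceptHeaderGet, the helper applyAcceptHeaders, and the header values are all assumptions made for illustration (the values merely resemble what a browser might send). The userAgent parameter stands in for the USER_AGENT constant from the question. Nothing here guarantees the server will stop answering 406.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class AcceptHeaderGet {

    // Hypothetical, browser-like values; adjust them to what the target site expects.
    static void applyAcceptHeaders(HttpURLConnection con) {
        con.setRequestProperty("Accept",
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
        con.setRequestProperty("Accept-Language", "en-US,en;q=0.5");
        con.setRequestProperty("Accept-Charset", "utf-8");
    }

    // Same flow as the question's sendGet, plus the negotiation headers.
    static String sendGet(String url, String userAgent) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("User-Agent", userAgent);
        applyAcceptHeaders(con);

        StringBuilder response = new StringBuilder();
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                // Unlike the original, keep line breaks so the HTML stays readable.
                response.append(line).append('\n');
            }
        }
        return response.toString();
    }
}
```

The returned string can be passed to Jsoup.parse exactly as before.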
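It is also possible that the URL you are hitting has some kind of bot or crawler detection logic and returns 406 because it considers that behavior "not acceptable". 406 is not the ideal status code for that use case, but it would make sense.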
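One way to tell the two explanations apart might be to look at what the server actually sends back with the 406. With HttpURLConnection, getInputStream() throws on 4xx/5xx responses, but getResponseCode() and getErrorStream() still work. The sketch below is only an illustration: StatusProbe, Result, probe, and readFully are hypothetical names, and the server may well return an empty body.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class StatusProbe {

    // Status code plus whatever body the server sent (possibly empty).
    static final class Result {
        final int status;
        final String body;

        Result(int status, String body) {
            this.status = status;
            this.body = body;
        }
    }

    // Reads a stream to the end as UTF-8; a null stream yields an empty string.
    static String readFully(InputStream raw) throws IOException {
        if (raw == null) {
            return "";
        }
        try (InputStream in = raw) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    static Result probe(String url, String userAgent) throws IOException {
        HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
        con.setRequestMethod("GET");
        con.setRequestProperty("User-Agent", userAgent);

        int status = con.getResponseCode();
        // For error statuses the body, if any, is on the error stream.
        InputStream raw = status >= 400 ? con.getErrorStream() : con.getInputStream();
        return new Result(status, readFully(raw));
    }
}
```

If the body mentions blocking or a security module, crawler detection is the more likely cause; if the body is empty, the status code alone will not settle the question.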