Java URLConnection won't let me access the data on HTTP errors (404, 500, etc.)
I'm building a crawler and need to get the data from the stream regardless of whether the status is 200. curl does this, as does any standard browser. The following does not actually fetch the content of the response: even though there is some, an exception carrying the HTTP error status code is thrown. I want the output regardless. Is there a way? I would prefer to stay with this library because it actually uses persistent connections, which is ideal for the kind of crawling I'm doing.
package test;

import java.net.*;
import java.io.*;

public class Test {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://github.com/XXXXXXXXXXXXXX");
            URLConnection connection = url.openConnection();
            DataInputStream inStream = new DataInputStream(connection.getInputStream());
            String inputLine;
            while ((inputLine = inStream.readLine()) != null) {
                System.out.println(inputLine);
            }
            inStream.close();
        } catch (MalformedURLException me) {
            System.err.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.err.println("IOException: " + ioe);
        }
    }
}
It worked, thanks. Here is what I came up with, just a rough proof of concept:
import java.net.*;
import java.io.*;

public class Test {
    public static void main(String[] args) {
        //InputStream error = ((HttpURLConnection) connection).getErrorStream();
        URL url = null;
        URLConnection connection = null;
        String inputLine = "";
        try {
            url = new URL("http://verelo.com/asdfrwdfgdg");
            connection = url.openConnection();
            DataInputStream inStream = new DataInputStream(connection.getInputStream());
            while ((inputLine = inStream.readLine()) != null) {
                System.out.println(inputLine);
            }
            inStream.close();
        } catch (MalformedURLException me) {
            System.err.println("MalformedURLException: " + me);
        } catch (IOException ioe) {
            System.err.println("IOException: " + ioe);
            InputStream error = ((HttpURLConnection) connection).getErrorStream();
            try {
                int data = error.read();
                while (data != -1) {
                    //do something with data...
                    //System.out.println(data);
                    inputLine = inputLine + (char)data;
                    data = error.read();
                    //inputLine = inputLine + (char)data;
                }
                error.close();
            } catch (Exception ex) {
                try {
                    if (error != null) {
                        error.close();
                    }
                } catch (Exception e) {
                }
            }
        }
        System.out.println(inputLine);
    }
}
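The byte-by-byte loop in the proof of concept casts each byte to a char, which only holds up for single-byte characters, and it fails with a NullPointerException when getErrorStream() returns null. A minimal sketch of a reusable reader follows. The names StreamText and readAll are hypothetical, and UTF-8 is an assumption: a real crawler would probably take the charset from the Content-Type header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamText {

    // Hypothetical helper: reads an entire stream as text and closes it.
    // Assumes the body is UTF-8 encoded.
    static String readAll(InputStream in) {
        if (in == null) {
            // getErrorStream() may return null when there is no error body.
            return "";
        }
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            char[] buf = new char[4096];
            int n;
            // Read in chunks so line terminators are preserved as sent.
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

With this in place, the catch block could shrink to something like `inputLine = StreamText.readAll(((HttpURLConnection) connection).getErrorStream());`.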
Simple:
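URLConnection connection = url.openConnection();
InputStream is;
if (connection instanceof HttpURLConnection) {
    HttpURLConnection httpConn = (HttpURLConnection) connection;
    int statusCode = httpConn.getResponseCode();
    // getInputStream() throws an IOException for 4xx/5xx responses,
    // so check the status before choosing which stream to read.
    if (statusCode >= 200 && statusCode < 300) {
        is = httpConn.getInputStream();
    } else {
        // May be null if the server sent no body with the error.
        is = httpConn.getErrorStream();
    }
} else {
    is = connection.getInputStream();
}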
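After calling openConnection, you need to do the following.
(The success test should be 200 ...)
"InputStream is = connection.getResponseMessage();" I don't see a getResponseMessage method in the URLConnection class. It is part of HttpURLConnection, so shouldn't we cast to that? Or can we replace getResponseMessage with getInputStream, or would that throw an exception?
That was a typo; it should be connection.getInputStream().
Very nice and short answer.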
URLConnection connection = url.openConnection();
InputStream is = null;
try {
    is = connection.getInputStream();
} catch (IOException ioe) {
    if (connection instanceof HttpURLConnection) {
        HttpURLConnection httpConn = (HttpURLConnection) connection;
        int statusCode = httpConn.getResponseCode();
        if (statusCode != 200) {
            is = httpConn.getErrorStream();
        }
    }
}
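Putting the answers together, here is a minimal self-contained sketch that returns the status and body whether or not the request succeeded. The names ErrorBodyFetcher, fetch and demo and the /missing path are hypothetical. So that it runs without network access, demo() starts a throwaway local server, using the JDK's built-in com.sun.net.httpserver.HttpServer, that always answers 404 with a body. The sketch tests `status >= 400` rather than `!= 200`, on the assumption that the JDK's default implementation only exposes an error stream for 4xx and 5xx responses.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class ErrorBodyFetcher {

    // Returns "<status> <body>" for any HTTP response, including 4xx and 5xx.
    // Assumes the body is UTF-8 text.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        int status = conn.getResponseCode();
        // getInputStream() throws for error responses, so pick the stream by status.
        InputStream in = (status >= 400) ? conn.getErrorStream() : conn.getInputStream();
        if (in == null) {
            return status + " "; // no body was sent
        }
        try (InputStream body = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            // Drain the stream completely before closing it.
            while ((n = body.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return status + " " + new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Starts a local server that always replies 404 with a body, fetches from it, then stops it.
    static String demo() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] page = "custom not-found page".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(404, page.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(page);
                }
            });
            server.start();
            try {
                return fetch("http://127.0.0.1:" + server.getAddress().getPort() + "/missing");
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "404 custom not-found page"
    }
}
```

Checking the status first avoids using the IOException as control flow. Reading the error body to the end before closing it should also help the connection be reused, which matters for the persistent-connection crawling described in the question.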