Java: bad_record_mac when crawling HTTPS with HttpClient 4.1
Tags: java, ssl, https, httpclient

I use the following code to crawl HTTPS sites. It works on most sites but fails on some, such as grubhub.com.
import java.io.IOException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLException;
import javax.net.ssl.SSLSession;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import org.apache.http.HttpEntity;
import org.apache.http.HttpResponse;
import org.apache.http.client.HttpClient;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.conn.ClientConnectionManager;
import org.apache.http.conn.scheme.Scheme;
import org.apache.http.conn.scheme.SchemeRegistry;
import org.apache.http.conn.ssl.SSLSocketFactory;
import org.apache.http.conn.ssl.X509HostnameVerifier;
import org.apache.http.impl.client.DefaultHttpClient;
import org.apache.http.util.EntityUtils;
public class TestSsl {

    HttpClient httpclient = new DefaultHttpClient();

    public static void main(String[] args) throws Exception {
        TestSsl t = new TestSsl();
        t.init();
        t.queryData();
    }

    private void init() {
        httpclient = wrapClient(httpclient);
        // 30-second socket read timeout
        httpclient.getParams()
                .setIntParameter("http.socket.timeout", 30 * 1000);
    }

    /**
     * Fetches the test page and prints the response body.
     *
     * @throws Exception if the request or the SSL handshake fails
     */
    private void queryData() throws Exception {
        HttpGet httpget = new HttpGet("https://www.grubhub.com/browse/");
        HttpResponse response = httpclient.execute(httpget);
        HttpEntity entity = response.getEntity();
        String result = EntityUtils.toString(entity);
        System.out.println(result);
        EntityUtils.consume(entity);
    }

    /**
     * Wraps a client so that it accepts any server certificate and any host name.
     * This disables certificate validation entirely.
     *
     * @param base the client whose connection manager and parameters are reused
     * @return a client with the permissive "https" scheme registered, or null on failure
     */
    public static HttpClient wrapClient(HttpClient base) {
        try {
            SSLContext ctx = SSLContext.getInstance("SSL");
            // Trust manager that performs no checks at all
            X509TrustManager tm = new X509TrustManager() {
                public void checkClientTrusted(X509Certificate[] xcs,
                        String string) {
                }

                public void checkServerTrusted(X509Certificate[] xcs,
                        String string) {
                }

                public X509Certificate[] getAcceptedIssuers() {
                    return null;
                }
            };
            // Host name verifier that accepts every host
            X509HostnameVerifier verifier = new X509HostnameVerifier() {
                @Override
                public void verify(String string, SSLSocket ssls) throws IOException {
                }

                @Override
                public void verify(String string, X509Certificate xc) throws SSLException {
                }

                @Override
                public void verify(String string, String[] strings, String[] strings1) throws SSLException {
                }

                @Override
                public boolean verify(String string, SSLSession ssls) {
                    return true;
                }
            };
            ctx.init(null, new TrustManager[] { tm }, null);
            SSLSocketFactory ssf = new SSLSocketFactory(ctx);
            ssf.setHostnameVerifier(verifier);
            // Register the permissive factory for "https" on the existing connection manager
            ClientConnectionManager ccm = base.getConnectionManager();
            SchemeRegistry sr = ccm.getSchemeRegistry();
            sr.register(new Scheme("https", ssf, 443));
            return new DefaultHttpClient(ccm, base.getParams());
        } catch (Exception ex) {
            ex.printStackTrace();
            return null;
        }
    }
}
I even set the following JVM options: -Djavax.net.debug=all -Dhttps.protocols=SSLv3 -Dforce.http.jre.executor=true
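main, READ: SSLv3 Alert, length = 2
main, RECV SSLv3 ALERT: fatal, bad_record_mac
main, called closeSocket()
main, handling exception: javax.net.ssl.SSLException: Received fatal alert: bad_record_mac
main, called close()
main, called closeInternal(true)
main, called close()
main, called closeInternal(true)
main, called close()
main, called closeInternal(true)
Exception in thread "main" javax.net.ssl.SSLException: Received fatal alert: bad_record_mac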
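I have researched this and tried adding many things to the code, but still no luck. The full debug output is too large to post here, but the problem is easy to reproduce by creating a new class in an IDE and running it.

Any help is much appreciated.

You have run into a known JRE issue: you are trying to speak TLS to a server that does not support TLS. The suggested workaround is to force SSLv3. Unfortunately, trying to do that with system properties does not work, so it has to be done by setting the enabled protocols to SSLv3 on each socket. Here is a small application that works for this host: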
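import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.URL;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory; // the JDK factory, not org.apache.http.conn.ssl.SSLSocketFactory

public class Sslv3Client {
    public static void main(String[] args) throws IOException {
        String urlString = "https://www.grubhub.com/browse/";
        URL url = new URL(urlString);
        SSLSocketFactory factory = (SSLSocketFactory) SSLSocketFactory.getDefault();
        SSLSocket socket = (SSLSocket) factory.createSocket(url.getHost(), 443);
        socket.setEnabledProtocols(new String[]{"SSLv3"}); // <--- THIS IS THE WORK-AROUND
        PrintWriter out = new PrintWriter(
                new OutputStreamWriter(
                        socket.getOutputStream()));
        // HTTP/1.1 needs CRLF line endings and a Host header;
        // "Connection: close" makes the read loop below terminate
        out.print("GET " + url.getPath() + " HTTP/1.1\r\n");
        out.print("Host: " + url.getHost() + "\r\n");
        out.print("Connection: close\r\n\r\n");
        out.flush();
        BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()));
        String line;
        while ((line = in.readLine()) != null) {
            System.out.println(line);
        }
        out.close();
        in.close();
        socket.close();
    }
}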
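If you would rather not write the HTTP request by hand, the same per-socket idea can be packaged as a delegating socket factory. The following is only a sketch using the plain JDK API: the class name ProtocolPinningSocketFactory is made up for illustration, and how you plug such a factory into HttpClient depends on the HttpClient version, so that part is not shown.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Hypothetical helper: delegates to another factory and pins the enabled protocols on every socket. */
final class ProtocolPinningSocketFactory extends SSLSocketFactory {
    private final SSLSocketFactory delegate;
    private final String[] protocols;

    ProtocolPinningSocketFactory(SSLSocketFactory delegate, String... protocols) {
        this.delegate = delegate;
        this.protocols = protocols.clone();
    }

    /** Applies the protocol list to the socket if it is an SSLSocket. */
    private Socket pin(Socket socket) {
        if (socket instanceof SSLSocket) {
            ((SSLSocket) socket).setEnabledProtocols(protocols);
        }
        return socket;
    }

    @Override
    public String[] getDefaultCipherSuites() {
        return delegate.getDefaultCipherSuites();
    }

    @Override
    public String[] getSupportedCipherSuites() {
        return delegate.getSupportedCipherSuites();
    }

    @Override
    public Socket createSocket() throws IOException {
        return pin(delegate.createSocket());
    }

    @Override
    public Socket createSocket(Socket s, String host, int port, boolean autoClose) throws IOException {
        return pin(delegate.createSocket(s, host, port, autoClose));
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        return pin(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(String host, int port, InetAddress localHost, int localPort) throws IOException {
        return pin(delegate.createSocket(host, port, localHost, localPort));
    }

    @Override
    public Socket createSocket(InetAddress host, int port) throws IOException {
        return pin(delegate.createSocket(host, port));
    }

    @Override
    public Socket createSocket(InetAddress address, int port, InetAddress localAddress, int localPort) throws IOException {
        return pin(delegate.createSocket(address, port, localAddress, localPort));
    }
}
```

For HttpsURLConnection it could be installed with HttpsURLConnection.setDefaultSSLSocketFactory(new ProtocolPinningSocketFactory((SSLSocketFactory) SSLSocketFactory.getDefault(), "SSLv3")). Note that recent JREs disable SSLv3 by default through the jdk.tls.disabledAlgorithms security property, so on those the handshake may be refused even with the protocol pinned.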
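I set this, but it did not help. Please try it and let me know whether it works for you; maybe something is wrong with my environment. Thanks.

@xigua - You are right. Setting the property does not work. I have updated my answer with a working solution.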
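A likely reason the property had no effect: https.protocols is documented as applying to connections made through HttpsURLConnection, while HttpClient creates its own SSL sockets. The small check below is a sketch (the class name ProtocolPropertyCheck is made up) that compares the protocols enabled on a fresh socket from the default JDK factory before and after setting the property. If the property influenced raw sockets, the two lists would differ.

```java
import java.io.IOException;
import java.util.Arrays;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Hypothetical diagnostic: does https.protocols change what a raw SSLSocket enables? */
final class ProtocolPropertyCheck {

    /** Returns the protocols enabled on a fresh, unconnected socket from the default factory. */
    static String[] defaultEnabledProtocols() throws IOException {
        try (SSLSocket socket = (SSLSocket) SSLSocketFactory.getDefault().createSocket()) {
            return socket.getEnabledProtocols();
        }
    }

    public static void main(String[] args) throws IOException {
        String[] before = defaultEnabledProtocols();
        // The value is arbitrary; any protocol name works for this comparison
        System.setProperty("https.protocols", "TLSv1.2");
        String[] after = defaultEnabledProtocols();

        System.out.println("before: " + Arrays.toString(before));
        System.out.println("after:  " + Arrays.toString(after));
        System.out.println("unchanged: " + Arrays.equals(before, after));
    }
}
```

Running it with -Djavax.net.debug=all alongside the real client shows which protocol version is actually negotiated.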