Unable to download a specific URL in Java

I am writing the following program to download a URL using Apache Commons IO, and I get a read timeout exception.

Exception:

java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at java.net.SocketInputStream.read(Unknown Source)
at sun.security.ssl.InputRecord.readFully(Unknown Source)
at sun.security.ssl.InputRecord.read(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readRecord(Unknown Source)
at sun.security.ssl.SSLSocketImpl.readDataRecord(Unknown Source)
at sun.security.ssl.AppInputStream.read(Unknown Source)
at java.io.BufferedInputStream.fill(Unknown Source)
at java.io.BufferedInputStream.read1(Unknown Source)
at java.io.BufferedInputStream.read(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTPHeader(Unknown Source)
at sun.net.www.http.HttpClient.parseHTTP(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(Unknown Source)
at java.net.URL.openStream(Unknown Source)
at org.apache.commons.io.FileUtils.copyURLToFile(FileUtils.java:1456)
at com.touseef.stock.FileDownload.main(FileDownload.java:23)

Program:

    String urlStr = "https://www.nseindia.com/";
    File file = new File("C:\\User\\WorkSpace\\Output.txt");
    URL url;
    try {
        url = new URL(urlStr);
        FileUtils.copyURLToFile(url, file);
        System.out.println("Successfully Completed.");
    } catch (MalformedURLException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }

Other websites download without any problem. Please advise.
I am using the commons-io-2.6 jar.

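The website appears to be protected by some web gateway DoS protection service (possibly Akamai?). Clients seem to be fingerprinted by their TLS connection and HTTP request headers, and only genuine web browsers are allowed to connect to the site.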

The following code, which uses Apache HttpClient, works at least for the moment:

    String urlStr = "https://www.nseindia.com/";
    File file = new File("C:\\User\\WorkSpace\\Output.txt");
    String userAgent = "-";

    CloseableHttpClient httpclient = HttpClients.custom().setUserAgent(userAgent).build();
    HttpGet httpget = new HttpGet(urlStr);
    httpget.addHeader("Accept-Language", "en-US");
    httpget.addHeader("Cookie", "");

    System.out.println("Executing request " + httpget.getRequestLine());
    try (CloseableHttpResponse response = httpclient.execute(httpget)) {
        System.out.println("----------------------------------------");
        System.out.println(response.getStatusLine());
        String body = EntityUtils.toString(response.getEntity());
        System.out.println(body);
        Files.writeString(file.toPath(), body);
    }
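
For example, a request that works in Firefox does not work in Java, because the TLS connection uses different protocols and ciphers. I tried several combinations with the Apache HTTP client. However, even the identical request, when issued by Fiddler, failed.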

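Using this website from Java is therefore very difficult. Even though the code above currently works, the protection system can be adjusted at any time so that it stops working again.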


I assume such a site offers an API intended specifically for programmatic use. Contact them and ask; that is the only advice I can give you.

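Use the other overload, copyURLToFile(URL source, File destination, int connectionTimeout, int readTimeout), and specify longer timeouts. The error occurs because the read timeout is too small.

Thanks for the comment. I used the other overload with longer timeouts and still get the same problem. It happens only with this specific website. Other websites are easily accessible, and the website above opens seamlessly in a web browser.

This problem is specific to this server. I assume it requires specific HTTP headers and otherwise returns nothing. So you cannot use FileUtils.copyURLToFile at all. You have to open the HTTP connection manually, set the request headers, and proceed from there.

Thanks for the answer. I got it fixed by using the ui4j and jsoup APIs.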
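
The suggestion in the comments to open the HTTP connection manually and set the request headers can be sketched with the JDK's own HttpURLConnection. This is only a minimal illustration: the class name HeaderDownload, the helper names, the User-Agent value, and the timeout values are assumptions and do not come from the thread. Given the fingerprinting described in the answer, it may well still fail against this particular site.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class HeaderDownload {

    // Prepares (but does not yet connect) an HTTP connection with explicit
    // timeouts and request headers. The header values are illustrative.
    static HttpURLConnection open(URL url, int connectTimeoutMs, int readTimeoutMs)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setConnectTimeout(connectTimeoutMs);
        conn.setReadTimeout(readTimeoutMs);
        conn.setRequestProperty("User-Agent", "Mozilla/5.0"); // assumed value
        conn.setRequestProperty("Accept-Language", "en-US");
        return conn;
    }

    // Streams the response body to the target file, replacing it if present.
    static void download(URL url, Path target) throws IOException {
        HttpURLConnection conn = open(url, 10_000, 30_000);
        try (InputStream in = conn.getInputStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: HeaderDownload <url> <output-file>");
            return;
        }
        download(new URL(args[0]), Paths.get(args[1]));
        System.out.println("Successfully Completed.");
    }
}
```

Unlike FileUtils.copyURLToFile, this approach lets you control every header that is sent, but it does not change the TLS handshake, which the answer identifies as part of the fingerprint.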