How to download a file from a JavaScript link in HtmlUnit

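As the title says, I am trying to download a file from a JavaScript link using HtmlUnit.

The page I am starting from is the one loaded with webClient.getPage(...) in the code on this page. When I click the "Authenticate with Java Web Start (new method)" link in a browser, a .jnlp file is downloaded. Running that file opens a Java program window that asks for authentication information. Once authentication succeeds, the original browser window loads the page containing the information I want to scrape.

The source snippet for the link on the starting page is: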

<tr>
<!-- onClick="return launchWebStart('authenticate');" -->
    <td><a href="javascript:void(0)" id="webstart-authenticate" ><font size="5">Authenticate with Java Web Start (new method)</font></a>
</tr>
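
What gets printed is the HTML of the starting page, not the expected .jnlp file. The console also prints a status update from the JavaScript WebConsole every 3 seconds (at least if I make the code wait long enough), so I know the JavaScript is doing something. The functions launchWebStart and followMediator live in a separate JavaScript file, WebStart.js. The log output is listed further down this page.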

I also tried using a CollectingAttachmentHandler object, as described elsewhere. The full Test2 class is listed further down this page.

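This code also prints the contents of the starting page, so none of the other solutions seem to work for me, and I cannot tell what I am doing wrong. I have run out of ideas for getting this to work (I thought this would be easy!). Any suggestions are much appreciated.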

Here is a working version based on Test2:

    WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);

    // open starting webpage
    HtmlPage page = webClient.getPage("https://ppair.uspto.gov/TruePassWebStart/AuthenticationChooser.html");

    // id of the element where the link is
    String linkID = "webstart-authenticate";

    // identify the appropriate anchor
    HtmlAnchor anchor = (HtmlAnchor) page.getElementById(linkID);

    CountDownLatch latch = new CountDownLatch(1);
    webClient.setWebStartHandler(new WebStartHandler(){

        @Override
        public void handleJnlpResponse(WebResponse webResponse)
        {
            System.out.println("downloading...");
            try (FileOutputStream fos = new FileOutputStream("/Users/Franklyn/Downloads/uspto-auth.authenticate2.jnlp"))
            {
                IOUtils.copy(webResponse.getContentAsStream(),fos);
            } catch (IOException e)
            {
                throw new RuntimeException(e);
            }
            System.out.println("downloaded");
            latch.countDown();
        }
    });

    anchor.click();
    latch.await(); // wait for the download to finish

    webClient.close();
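
One caveat about the snippet above: latch.await() with no arguments blocks indefinitely if the handler is never called. The following is a minimal sketch of a bounded wait, using only the JDK and no HtmlUnit classes. The class name BoundedDownloadWait, the helper methods, the fake payload, and the 30-second timeout are all assumptions made for illustration. Files.copy stands in for IOUtils.copy here so that the sketch has no third-party dependency.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of HtmlUnit.
class BoundedDownloadWait {

    /** Copies the stream to target, replacing any existing file; returns the number of bytes written. */
    static long save(InputStream in, Path target) {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Waits for the latch; returns false if the timeout elapsed before countDown() was called. */
    static boolean awaitDownload(CountDownLatch latch, long timeout, TimeUnit unit)
            throws InterruptedException {
        return latch.await(timeout, unit);
    }

    public static void main(String[] args) throws Exception {
        Path target = Files.createTempFile("demo", ".jnlp");
        CountDownLatch latch = new CountDownLatch(1);

        // Stand-in for the handler callback: writes a fake payload on another thread.
        Thread callback = new Thread(() -> {
            byte[] fake = "<jnlp/>".getBytes(StandardCharsets.UTF_8);
            try {
                save(new ByteArrayInputStream(fake), target);
            } finally {
                latch.countDown(); // release the waiter even if the copy failed
            }
        });
        callback.start();

        if (awaitDownload(latch, 30, TimeUnit.SECONDS)) {
            System.out.println("saved " + Files.size(target) + " bytes to " + target);
        } else {
            System.out.println("timed out waiting for the download");
        }
    }
}
```

Calling countDown() in a finally block means a failed copy still releases the waiting thread instead of leaving it blocked until the timeout.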
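
So why didn't your Test2 work? Because the downloaded file is served with the content type application/x-java-jnlp-file, so you need to use a WebStartHandler. If the response had included a header named "Content-Disposition" whose value starts with "attachment", Test2 would probably have worked.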
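
The explanation above amounts to a routing decision based on two response headers. Below is a simplified model of that decision in plain Java. It is not HtmlUnit's actual source: the class name ResponseDispatchSketch, the Route enum, the order of the checks, and the case-insensitive comparisons are assumptions made for illustration, and it presumes both handlers have been registered on the client.

```java
import java.util.Locale;

// Hypothetical model of the routing described in the answer; not HtmlUnit code.
class ResponseDispatchSketch {

    enum Route { WEB_START_HANDLER, ATTACHMENT_HANDLER, REGULAR_PAGE }

    static final String JNLP_TYPE = "application/x-java-jnlp-file";

    /**
     * Decides which handler would receive a response, given its content type
     * and the value of its Content-Disposition header (either may be null).
     */
    static Route route(String contentType, String contentDisposition) {
        // A JNLP content type goes to the Web Start handler.
        if (contentType != null && JNLP_TYPE.equalsIgnoreCase(contentType.trim())) {
            return Route.WEB_START_HANDLER;
        }
        // A Content-Disposition value starting with "attachment" goes to the attachment handler.
        if (contentDisposition != null
                && contentDisposition.trim().toLowerCase(Locale.ROOT).startsWith("attachment")) {
            return Route.ATTACHMENT_HANDLER;
        }
        // Anything else is loaded as an ordinary page.
        return Route.REGULAR_PAGE;
    }

    public static void main(String[] args) {
        // The case from the question: JNLP content type, no Content-Disposition header.
        System.out.println(route("application/x-java-jnlp-file", null));
        // A typical file download that an attachment handler would collect.
        System.out.println(route("application/pdf", "attachment; filename=\"report.pdf\""));
        // A normal HTML response.
        System.out.println(route("text/html", null));
    }
}
```

In this model the response from the question never reaches the attachment branch, which matches the observation that Test2 collected no attachments.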

WebConsole log output referenced in the question:

Nov 21, 2016 2:53:25 PM com.gargoylesoftware.htmlunit.WebConsole info
INFO: launchWebStart

Nov 21, 2016 2:53:25 PM com.gargoylesoftware.htmlunit.WebConsole info
INFO: followMediator

Nov 21, 2016 2:53:25 PM com.gargoylesoftware.htmlunit.WebConsole info
INFO: responseReceived:200
WAIT

Nov 21, 2016 2:53:25 PM com.gargoylesoftware.htmlunit.WebConsole info
INFO: mediatorCallback: next wait

The Test2 class from the question:

import java.io.IOException;
import java.net.MalformedURLException;
import java.util.List;

import com.gargoylesoftware.htmlunit.*;
import com.gargoylesoftware.htmlunit.attachment.Attachment;
import com.gargoylesoftware.htmlunit.attachment.CollectingAttachmentHandler;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class Test2 {

    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_45);

        // open starting webpage
        HtmlPage page = webClient.getPage("https://ppair.uspto.gov/TruePassWebStart/AuthenticationChooser.html");

        // id of the element where the link is
        String linkID = "webstart-authenticate";

        // identify the appropriate anchor
        HtmlAnchor anchor = (HtmlAnchor) page.getElementById(linkID);

        CollectingAttachmentHandler attachmentHandler = new CollectingAttachmentHandler();
        webClient.setAttachmentHandler(attachmentHandler);
        attachmentHandler.handleAttachment(anchor.click());
        List<Attachment> attachments = attachmentHandler.getCollectedAttachments();

        int i = 0;
        while (i < attachments.size()) {
            Attachment attachment = attachments.get(i);
            Page attachedPage = attachment.getPage();
            WebResponse attachmentResponse = attachedPage.getWebResponse();
            String content = attachmentResponse.getContentAsString();
            System.out.println(content);
            i++;
        }
        webClient.close();
    }
}