HtmlUnit: Loading elements on an AJAX page


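I'm new to Java and HtmlUnit and am trying to scrape news updates from a page that loads them via AJAX calls. Whatever I do, the updates are never loaded. What am I missing?

I have tried several ways of waiting for the JS scripts to finish, but none of them worked. Clicking the button that loads more news, or firing its events, doesn't seem to help either.

I have been working on the assumption that I don't need to reassign my page instance after the JS scripts have finished. Is that correct?

I have also read that HtmlUnit's JS engine doesn't work well with some websites. Is that the case here, or am I just missing something?

Thanks for your help!

Here is my code: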

import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlButton;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlForm;
import com.gargoylesoftware.htmlunit.html.HtmlInput;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.io.IOException;
import java.util.List;
import org.junit.Assert;

public class ProblemDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        WebClient webClient = new WebClient(BrowserVersion.FIREFOX_38);
        webClient.getOptions().setThrowExceptionOnScriptError(false);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController());
        webClient.getOptions().setTimeout(10000);
        webClient.setJavaScriptTimeout(10000);
        webClient.getOptions().setJavaScriptEnabled(true);

        // Login procedure
        HtmlPage page = webClient.getPage("https://login.xing.com/login");

        final HtmlForm form = (HtmlForm) page.getElementById("login-form");
        final HtmlInput userID = form.getInputByName("login_form[username]");
        final HtmlInput password = form.getInputByName("login_form[password]");
        final HtmlButton submit = form.getButtonByName("button");
        final HtmlInput remember = form.getInputByName("login_form[perm]");

        userID.setValueAttribute("user");
        password.setValueAttribute("pass");
        remember.setChecked(true);
        page = submit.click();

        Assert.assertEquals("Start | XING", page.getTitleText());

        //Navigate to page to be scraped
        page = webClient.getPage(
                "https://www.xing.com/companies/deutschepostag/updates");
        webClient.waitForBackgroundJavaScript(10*1000);
        System.out.println(page.getUrl().toString());
        System.out.println(page.asXml());

        //Print number of employees (works, not dynamic)
        HtmlElement result = page.getFirstByXPath("//div[@id='profile-nav-tabs']"
                + "/ul/li[@id='employees-tab']/a");
        System.out.println("Employees: " + result.getTextContent());

        //Print news (doesn't work)
        String news;
        List<HtmlElement> results = (List<HtmlElement>) page.getByXPath("//div"
                + "[@id='company-updates']/ul[@id='news-feed']/li/div"
                + "[@class='activity-content']");
        System.out.println("News found: " + results.size());
        for(HtmlElement item : results){
            news = "";
            System.out.println("            NEW ITEM");
            System.out.println(item.getTextContent());
        }
    }
}

Setting setThrowExceptionOnScriptError to false prevents you from seeing the errors.

Edit: the latest snapshot includes a fix for performance.navigation.redirectCount.


Please try it and report back.
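On the "waiting for the JS scripts to finish" part of the question: a fixed waitForBackgroundJavaScript call waits for background jobs, not for a particular element, so one common alternative is to poll for the element itself with a deadline. The helper below is only a sketch of that polling pattern. PollingWait and waitUntil are hypothetical names, not part of HtmlUnit, and the demo condition is a stand-in for a real page check. It will not help if the page's scripts fail with an error, which is what the answer above suggests is happening here.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical polling helper; not part of HtmlUnit. */
final class PollingWait {
    private PollingWait() {}

    /**
     * Re-checks the condition every pollMillis until it holds or
     * timeoutMillis has elapsed. Returns true if the condition held,
     * false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in condition: becomes true roughly 200 ms after start,
        // playing the role of "the AJAX content has arrived".
        long begin = System.currentTimeMillis();
        boolean arrived = waitUntil(() -> System.currentTimeMillis() - begin >= 200, 2000, 20);
        System.out.println("Condition met: " + arrived);
    }
}
```

With the code from the question, the condition could be something like () -> !page.getByXPath("//ul[@id='news-feed']/li").isEmpty(), evaluated against the same page instance. This assumes HtmlUnit keeps running its background JavaScript jobs while the calling thread sleeps.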

Hi, have you had a chance to test the latest snapshot since the answer was edited?
WARNING: Obsolete content type encountered: 'text/javascript'.