Java: maintaining login credentials across pages in an HtmlUnit WebClient
My problem is very similar to a previous question, except that I have no access to the remote server and don't know how it authenticates. I am trying to stay logged in across pages that I request with webClient.getPage(). The site I visit uses a standard login form with a username/password pair. What I did so far is write a small helper function that logs in for me:
public static HtmlPage logIn(HtmlPage page) {
    HtmlPage nextpage = null;
    // locate the login form and its fields
    final HtmlForm form = page.getFormByName("login_form");
    final HtmlSubmitInput button = form.getInputByValue("Login");
    final HtmlTextInput username = form.getInputByName("username");
    final HtmlPasswordInput password = form.getInputByName("password");
    username.setValueAttribute("user_foo");
    password.setValueAttribute("pwd_bar");
    // hit the submit button and return the resulting page
    try {
        nextpage = button.click();
    } catch (IOException e) {
        e.printStackTrace();
    }
    return nextpage;
}
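For context on why a helper like this can keep a session alive at all: a form login typically works because the server answers the submit with a Set-Cookie header carrying a session ID, and any client that stores and replays that cookie stays logged in. A minimal stdlib sketch (the cookie value is hypothetical; PHPSESSID is only the PHP default name) of what such a cookie looks like:

```java
import java.net.HttpCookie;
import java.util.List;

public class SessionCookieDemo {
    public static void main(String[] args) {
        // A typical Set-Cookie header value returned by a PHP login form
        // (hypothetical session value for illustration).
        String header = "PHPSESSID=abc123def456; Path=/; HttpOnly";
        List<HttpCookie> cookies = HttpCookie.parse(header);
        HttpCookie session = cookies.get(0);
        System.out.println(session.getName());   // PHPSESSID
        System.out.println(session.getValue());  // abc123def456
        // To stay logged in, a client must send this cookie back
        // on every later request to the same site.
        System.out.println("Cookie: " + session);
    }
}
```

HtmlUnit's WebClient does this bookkeeping automatically through its CookieManager, which is why reusing one WebClient instance for all requests matters more than any explicit credential plumbing.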
The problem is that I have to search the page this function returns by hand to find the link to the page I actually want. Worse, this only works for the page reached right after login, not for other pages.
Instead, I would like the login to be kept in the browser simulator (the WebClient), so that I can seamlessly visit any protected page on the site. Besides trying the solution from the previous question (linked above), I also tried the following, without success:
private static void setCredentials(WebClient webClient) {
    String username = "user_foo";
    String password = "pwd_bar";
    DefaultCredentialsProvider creds =
            (DefaultCredentialsProvider) webClient.getCredentialsProvider();
    try {
        creds.addCredentials(username, password);
        webClient.setCredentialsProvider(creds);
    } catch (Exception e) {
        System.out.println("!!! Problem logging in");
        e.printStackTrace();
    }
}
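One likely reason the attempt above has no visible effect: a credentials provider like DefaultCredentialsProvider serves HTTP-level authentication (a 401 challenge for Basic/Digest/NTLM), not HTML form logins. With Basic auth, the stored username/password would end up on the wire as an Authorization header; a stdlib sketch of that encoding (using the question's placeholder credentials):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeaderDemo {
    public static void main(String[] args) {
        String username = "user_foo";
        String password = "pwd_bar";
        // RFC 7617: Authorization: Basic base64(username ":" password)
        String token = Base64.getEncoder().encodeToString(
                (username + ":" + password).getBytes(StandardCharsets.UTF_8));
        System.out.println("Authorization: Basic " + token);
        // A form-based site never sends the 401 challenge that would
        // trigger this header, so the credentials provider is never used.
    }
}
```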
Edit: here is the main function showing how I use the webClient:
public static void main(String[] args) throws Exception {
    // Create and initialize the WebClient object
    WebClient webClient = new WebClient(/*BrowserVersion.CHROME_16*/);
    webClient.setThrowExceptionOnScriptError(false);
    webClient.setJavaScriptEnabled(false);
    webClient.setCssEnabled(false);
    webClient.getCookieManager().setCookiesEnabled(true);
    setCredentials(webClient);
    HtmlPage subj_page = null;

    // visit the login page and fetch it
    String url = "http://www.website.com/index.php";
    HtmlPage page = (HtmlPage) webClient.getPage(url);
    HtmlAnchor anchor = null;
    page = logIn(page);
    // search for content
    page = searchPage(page, "recent articles");
    // click on the paper link
    anchor = (HtmlAnchor) page.getAnchorByText("recent articles");
    page = (HtmlPage) anchor.click();

    // loop through the found articles
    //{{{page
    int curr_pg = 1;
    int last_pg = 5;
    page = webClient.getPage(<starting URL of the first article>); // such URLs look like: "www.website.com/view_articles.php?publication_id=17&page=1"
    do {
        // find the sections on this page
        List<HtmlDivision> sections = (List<HtmlDivision>) page.getByXPath("//div[@class='article_section']");
        List<HtmlDivision> artdivs  = (List<HtmlDivision>) page.getByXPath("//div[@class='article_head']");
        List<HtmlDivision> tagdivs  = (List<HtmlDivision>) page.getByXPath("//div[@class='article_tag']");
        int num_ques = sections.size();
        HtmlDivision section, artdiv, tagdiv;
        // for every section, get its sub-articles
        for (int i = 0; i < num_ques; i++) {
            section = sections.get(i);
            artdiv = artdivs.get(i);
            tagdiv = tagdivs.get(i);
            // find the sub-article details and print them as XML
            String xml = getXMLArticle(artdiv, section.asText(), tagdiv);
            System.out.println(xml);
            System.out.println("-----------------------------");
        }
        // synchronize to avoid IllegalMonitorStateException
        synchronized (webClient) {
            webClient.wait(2000); // wait for 2 seconds
        }
        String href = "?publication_id=17&page=" + curr_pg;
        anchor = page.getAnchorByHref(href);
        page = anchor.click();
        System.out.println("anchor val: " + anchor.getHrefAttribute());
        curr_pg++;
    } while (curr_pg < last_pg);
    //}}}page
    webClient.closeAllWindows();
}
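The seamless access being asked for hinges on one thing: every getPage() call above goes through the same WebClient, whose cookie store survives between requests. HtmlUnit's CookieManager plays the same role that java.net.CookieManager plays for the JDK's own HTTP client. A stdlib sketch of that store-and-replay behavior, using the question's URLs and a hypothetical session cookie:

```java
import java.net.CookieManager;
import java.net.URI;
import java.util.List;
import java.util.Map;

public class CookieStoreDemo {
    public static void main(String[] args) throws Exception {
        CookieManager manager = new CookieManager();
        URI loginPage = URI.create("http://www.website.com/index.php");
        // Simulate the login response: the server sets a session cookie.
        manager.put(loginPage, Map.of("Set-Cookie",
                List.of("PHPSESSID=abc123; Path=/")));
        // A later request to a different page on the same host gets the
        // stored cookie back in its request headers automatically.
        URI articles = URI.create(
                "http://www.website.com/view_articles.php?publication_id=17&page=1");
        Map<String, List<String>> headers = manager.get(articles, Map.of());
        System.out.println(headers.get("Cookie")); // the stored session cookie
    }
}
```

If the main function above still loses the session, the thing to verify first is that the server really does set a cookie on the login response, e.g. by dumping webClient.getCookieManager().getCookies() right after logIn() returns.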
Additional info: I have no details about the remote server's authentication mechanism, since I cannot access it, but any help would be great. Thanks, everyone!

Comment: After logging in, how do you access the other pages, and what happens when you log in? Show us the code.

Reply: @JBNizet, I just posted the main code above. The implementation of getXMLArticle() doesn't matter, since it doesn't need the webClient. Let me know if you need any other information :)