Java: How do I fill out a form with Jsoup?
Tags: java, web-scraping, jsoup

I am trying to navigate to the description page of the California website, but I cannot get there.

I have an HTML form on which I submit the request. I cannot add the form here, but it is just a POST request with the required parameters.

I am able to get __EVENTTARGET and __EVENTARGUMENT from the previous page, the one I came here from.

What am I doing wrong?

Code:
String url = "http://kepler.sos.ca.gov/";
Connection.Response resp = Jsoup.connect(url)
        .timeout(30000)
        .method(Connection.Method.GET)
        .execute();
Document responseDocument = resp.parse();
Map<String, String> loginCookies = resp.cookies();
Element eventValidation = responseDocument.select("input[name=__EVENTVALIDATION]").first();
Element viewState = responseDocument.select("input[name=__VIEWSTATE]").first();
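You want to use FormElement. This is a useful feature of Jsoup: it can find the fields declared inside a form and post them for you. Before posting the form, you can set the values of the fields using the Jsoup API.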
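To make this concrete, here is a minimal offline sketch of what FormElement collects for submission. The HTML snippet, its field values, the base URI and the class name `FormDataSketch` are made up for illustration and are not taken from the California site. The point it tries to show is that hidden fields such as __VIEWSTATE are picked up automatically, so they should not need to be extracted by hand.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.FormElement;

class FormDataSketch {
    // Hypothetical stand-in for an ASP.NET page; the values are invented.
    static final String SAMPLE_HTML =
            "<form id='aspnetForm' method='post' action='/search'>"
            + "<input type='hidden' name='__VIEWSTATE' value='abc123'>"
            + "<input type='hidden' name='__EVENTVALIDATION' value='def456'>"
            + "<input type='text' name='SearchCriteria' value=''>"
            + "</form>";

    // Fills in the search box, then lists every "name=value" pair
    // that jsoup would send when the form is submitted.
    static List<String> submittedFields(String html, String searchText) {
        Document doc = Jsoup.parse(html, "http://example.com/");
        FormElement form = (FormElement) doc.select("form#aspnetForm").first();
        form.select("[name=SearchCriteria]").first().val(searchText);

        List<String> fields = new ArrayList<>();
        for (Connection.KeyVal kv : form.formData()) {
            fields.add(kv.key() + "=" + kv.value());
        }
        return fields;
    }

    public static void main(String[] args) {
        submittedFields(SAMPLE_HTML, "cali").forEach(System.out::println);
    }
}
```

Nothing is sent over the network here; `formData()` only reports the pairs, which makes it a convenient way to inspect a form before calling `submit()`.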
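Note:
In the sample code below, you will always see a call to the Element#select method followed by a call to the Elements#first method.
For example:
responseDocument.select("form#aspnetForm").first()
Jsoup has since introduced a more efficient alternative: Element#selectFirst. You can use it as a drop-in replacement for the original form.
For example, responseDocument.select("form#aspnetForm").first()
can be replaced by
responseDocument.selectFirst("form#aspnetForm")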
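The equivalence described in the note can be checked offline with a small sketch like the following. It assumes a jsoup version that provides `selectFirst`; the class name `SelectFirstSketch` and the HTML string are illustrative only.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class SelectFirstSketch {
    // Returns true when both lookup styles yield the very same element
    // (or both yield null because nothing matches the query).
    static boolean sameElement(String html, String cssQuery) {
        Document doc = Jsoup.parse(html);
        Element viaSelect = doc.select(cssQuery).first();
        Element viaSelectFirst = doc.selectFirst(cssQuery);
        return viaSelect == viaSelectFirst;
    }

    public static void main(String[] args) {
        String html = "<form id='aspnetForm'><input name='q'></form>";
        System.out.println(sameElement(html, "form#aspnetForm"));
    }
}
```

Both styles return null when there is no match, so a null check such as the checkElement helper used in the answers applies equally to either one.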
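Sample code
Output (as of this writing)
See also:
In this example, we will log into the GitHub website using the FormElement class.
All the form data is handled by the FormElement class for us (even the form method detection). A ready-made Connection is built when the FormElement#submit method is invoked. All we have to do is complete this connection with additional headers (cookies, user agent, etc.) and execute it.

This is the exact same code as posted in the accepted answer above, except that it reflects the changes California made to its website after the original answer was posted. As of this writing, this code works. I have updated the original comments to identify each change.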
// * Connect to website (Original url: http://kepler.sos.ca.gov/)
String url = "https://businesssearch.sos.ca.gov/";
Connection.Response resp = Jsoup.connect(url) //
.timeout(30000) //
.method(Connection.Method.GET) //
.execute();
// * Find the form (Original jsoup selector: form#aspnetForm)
Document responseDocument = resp.parse();
Element potentialForm = responseDocument.select("form#formSearch").first();
checkElement("form element", potentialForm);
FormElement form = (FormElement) potentialForm;
// * Fill in the form and submit it
// ** Search Type (Original jsoup selector: name$=RadioButtonList_SearchType)
Element radioButtonListSearchType = form.select("[name$=SearchType]").first();
checkElement("search type radio button list", radioButtonListSearchType);
radioButtonListSearchType.attr("checked", "checked");
// ** Name search (Original jsoup selector: name$=TextBox_NameSearch)
Element textBoxNameSearch = form.select("[name$=SearchCriteria]").first();
checkElement("name search text box", textBoxNameSearch);
textBoxNameSearch.val("cali");
// ** Submit the form
Document searchResults = form.submit().cookies(resp.cookies()).post();
// * Extract results (entity numbers in this sample code, original jsoup selector: id$=SearchResults_Corp)
for (Element entityNumber : searchResults.select("table[id$=enitityTable] > tbody > tr > td:first-of-type:not(td[colspan=5])")) {
System.out.println(entityNumber.text());
}
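The extraction selector in the loop above can be tried offline against a small table. The sketch below is only an illustration: the table contents, the class name `ResultSelectorSketch` and the entity numbers are invented, and the table id is spelled the way the selector spells it. It shows how `td:first-of-type` picks the first cell of each row while `:not(td[colspan=5])` skips rows whose only cell spans the whole table.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ResultSelectorSketch {
    // Hypothetical results table; the real page's markup may differ.
    static final String SAMPLE_HTML =
            "<table id='enitityTable'>"
            + "<tr><td>C0000001</td><td>Example One</td></tr>"
            + "<tr><td colspan='5'>expanded details</td></tr>"
            + "<tr><td>C0000002</td><td>Example Two</td></tr>"
            + "</table>";

    // Collects the text of the first cell of every regular result row.
    static List<String> entityNumbers(String html) {
        Document doc = Jsoup.parse(html); // the parser adds the implied <tbody>
        List<String> numbers = new ArrayList<>();
        for (Element cell : doc.select(
                "table[id$=enitityTable] > tbody > tr > td:first-of-type:not(td[colspan=5])")) {
            numbers.add(cell.text());
        }
        return numbers;
    }

    public static void main(String[] args) {
        entityNumbers(SAMPLE_HTML).forEach(System.out::println);
    }
}
```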
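Comments:

In addition, I am also able to get __EVENTVALIDATION and __VIEWSTATE.

Please post your code.

I cannot add the code. Basically it is jsoup in a JSP to get the needed data, then an HTML form to send to the California website: String url = ""; Connection.Response resp = Jsoup.connect(url).timeout(30000).method(Connection.Method.GET).execute(); Document responseDocument = resp.parse(); Map loginCookies = resp.cookies(); eventValidation = responseDocument.select("input[name=__EVENTVALIDATION]").first(); viewState = responseDocument.select("input[name=__VIEWSTATE]").first();

That looks fine to me. Where is the POST request?

This is the best JSoup example I have found. Thanks.

This answer definitely helped! As an improvement, I would add that in my case I had to send the id and name of the button in order to click it. I added .data("id_button", "name_button") to the last loginForm.submit(). For me it was: Connection.Response loginActionResponse = loginForm.submit().data("id_button", "name_button").cookies(loginFormResponse.cookies()).userAgent(USER_AGENT).execute();

See my answer below. It is the exact same code as given here, except that it reflects the changes California made to its website after the original answer was posted.

Could you point out more clearly which differences in your answer reflect "the changes California made to its website"?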
C3036475
C3027305
C3236514
C3027304
C3034012
C3035110
C3028330
C3035378
C3124793
C3734637
// # Constants used in this example
final String USER_AGENT = "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36";
final String LOGIN_FORM_URL = "https://github.com/login";
final String USERNAME = "yourUsername";
final String PASSWORD = "yourPassword";
// # Go to login page
Connection.Response loginFormResponse = Jsoup.connect(LOGIN_FORM_URL)
.method(Connection.Method.GET)
.userAgent(USER_AGENT)
.execute();
// # Fill the login form
// ## Find the form first...
FormElement loginForm = (FormElement)loginFormResponse.parse()
.select("div#login > form").first();
checkElement("Login Form", loginForm);
// ## ... then "type" the username ...
Element loginField = loginForm.select("#login_field").first();
checkElement("Login Field", loginField);
loginField.val(USERNAME);
// ## ... and "type" the password
Element passwordField = loginForm.select("#password").first();
checkElement("Password Field", passwordField);
passwordField.val(PASSWORD);
// # Now send the form for login
Connection.Response loginActionResponse = loginForm.submit()
.cookies(loginFormResponse.cookies())
.userAgent(USER_AGENT)
.execute();
System.out.println(loginActionResponse.parse().html());
public static void checkElement(String name, Element elem) {
if (elem == null) {
throw new RuntimeException("Unable to find " + name);
}
}