HtmlCleaner and XPath
Does HtmlCleaner support the XPath position() function and the use of predicates to indicate position? My code is as follows:
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import org.htmlcleaner.HtmlCleaner;
import org.htmlcleaner.TagNode;

// Fetch the page and parse it into HtmlCleaner's node tree.
HtmlCleaner htmlCleaner = new HtmlCleaner();
String sourceUrl = "http://jobs.alaska.gov/RR/WARN_notices.htm";
URL url = new URL(sourceUrl);
URLConnection urlConnection = url.openConnection();
TagNode rootTagNode = htmlCleaner.clean(new InputStreamReader(urlConnection.getInputStream()));
// xpathOne selects the first row (the headings); xpathTwo targets the third row.
String xpathOne = "//table[2]/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[1]/td/div/span/text()";
// String xpathTwo = "//table[2]/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[3]/td/div/span/text()";
Object[] xPathNodes = rootTagNode.evaluateXPath(xpathOne);
// Object[] xPathNodes = rootTagNode.evaluateXPath(xpathTwo);
for (Object object : xPathNodes) {
    System.out.println(object);
}
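xpathOne executes correctly and returns the table row containing the headings. xpathTwo returns nothing, although it should return the first row of data in the table. Any help would be greatly appreciated. Thanks.

I don't think there is a span element there, so shortening the path to //table[2]/tbody/tr/td/table/tbody/tr/td/table/tbody/tr[3]/td/div/text() may be what you want.

Great, that was an oversight on my part. Thank you very much for spotting it. I really appreciate it.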
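
The diagnosis above can be reproduced offline. The following is a minimal sketch, not HtmlCleaner itself: it swaps in the JDK's built-in javax.xml.xpath engine and a small hypothetical table (the SAMPLE string and the PositionalPredicateDemo class are invented for illustration and are not taken from the real page). It shows that the positional predicate tr[3] does select the third row, and that the empty result comes from the extra span step, which has nothing to match in that row.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PositionalPredicateDemo {
    // Hypothetical markup: rows 1 and 2 wrap their text in <span>, row 3 does not.
    static final String SAMPLE = "<table><tbody>"
            + "<tr><td><div><span>Header</span></div></td></tr>"
            + "<tr><td><div><span>Subheader</span></div></td></tr>"
            + "<tr><td><div>First data row</div></td></tr>"
            + "</tbody></table>";

    // Evaluates an XPath expression against an XML string and returns the
    // value of every matched node (for text() nodes, the text itself).
    static List<String> evaluate(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getNodeValue());
            }
            return values;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Row 1 has a span, so this matches: prints [Header]
        System.out.println(evaluate(SAMPLE, "//table/tbody/tr[1]/td/div/span/text()"));
        // Row 3 has no span, so the same shape matches nothing: prints []
        System.out.println(evaluate(SAMPLE, "//table/tbody/tr[3]/td/div/span/text()"));
        // Dropping the span step finds the text: prints [First data row]
        System.out.println(evaluate(SAMPLE, "//table/tbody/tr[3]/td/div/text()"));
    }
}
```

When an XPath with a positional predicate comes back empty, shortening it one step at a time, as in the last two calls, is a quick way to find which step fails to match.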