How to filter the contents of a div using an HTML parser in Java
I am trying to parse an HTML string using the HTMLParser library. The HTML looks like this:
<body>
<div class="Level1">
<div class="row">
<div class="txt">
Date of analysis:
</div><div class="content">
02/03/11
</div>
</div>
</div><div class="Level1">
<div class="row">
<div class="txt">
Site:
</div><div class="content">
13.0E
</div>
</div>
</div><div class="Level1">
<div class="row">
<div class="txt">
Network type:
</div><div class="content">
DVB-S
</div>
</div>
</div>
</body>
Can someone help me solve this? How do I read a div that is nested inside another div?
Thanks.
It is better to do it this way:
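import org.htmlparser.Node;
import org.htmlparser.filters.*;
import org.htmlparser.util.NodeList;

// Assumes parser was created earlier, e.g. Parser.createParser(html, null)
NodeList nl = parser.parse(null); // you can also filter here
// true = search recursively; the txt divs are nested below body
NodeList divs = nl.extractAllNodesThatMatch(
    new AndFilter(new TagNameFilter("DIV"),
                  new HasAttributeFilter("class", "txt")), true);
if (divs.size() > 0) {
    Node div = divs.elementAt(0);
    // getText() returns only the tag (div class="txt"); this returns the text inside
    String text = div.toPlainTextString().trim();
}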
nl.item(0) is the row element; I think we want the divs with class txt.
I tried your suggestion, but it didn't work... :( The most I achieved is: Parser parser = Parser.createParser(htmlFile, null); NodeList nl = parser.extractAllNodesThatMatch(new AndFilter(new TagNameFilter("DIV"), new HasAttributeFilter("class", "txt"))); for (int i = 0; i…
The result of the code is: div class="txt" div class="txt" div class="txt" div class="txt". It does not print the value of the div. Do you know how to print the value?
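One way to get each label together with its value: the sample in the question happens to be well-formed XML, so the pairing can be sketched with the JDK's built-in XML parser and XPath. Note that this swaps in a different technique and is not the HTMLParser API; it would likely fail on real-world HTML that is not well-formed. The class name DivPairs and the method extractPairs are hypothetical names chosen for illustration.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DivPairs {

    // Hypothetical helper: maps the text of each div.txt to the text of the
    // div.content inside the same div.row. Assumes well-formed input.
    static Map<String, String> extractPairs(String html) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(html)));

        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList rows = (NodeList) xpath.evaluate(
                "//div[@class='row']", doc, XPathConstants.NODESET);

        Map<String, String> pairs = new LinkedHashMap<>();
        for (int i = 0; i < rows.getLength(); i++) {
            Element row = (Element) rows.item(i);
            // Paths are relative to the current row element
            String label = xpath.evaluate("div[@class='txt']", row).trim();
            String value = xpath.evaluate("div[@class='content']", row).trim();
            pairs.put(label, value);
        }
        return pairs;
    }

    public static void main(String[] args) throws Exception {
        String html = "<body>"
                + "<div class=\"Level1\"><div class=\"row\">"
                + "<div class=\"txt\"> Date of analysis: </div>"
                + "<div class=\"content\"> 02/03/11 </div></div></div>"
                + "<div class=\"Level1\"><div class=\"row\">"
                + "<div class=\"txt\"> Site: </div>"
                + "<div class=\"content\"> 13.0E </div></div></div>"
                + "<div class=\"Level1\"><div class=\"row\">"
                + "<div class=\"txt\"> Network type: </div>"
                + "<div class=\"content\"> DVB-S </div></div></div>"
                + "</body>";
        // Print each label followed by its value
        extractPairs(html).forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

Iterating over the row divs, rather than over the txt divs alone, keeps each label next to its own value even if some row lacks a content div (the value is then an empty string).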