Java: How do I retrieve data from strong tags in an HTML file using jsoup?
I have some HTML data like this:
<div class="bs-example">
<div class="panel panel-primary">
<div class="panel-heading">
<h3 class="panel-title">ABC</h3>
</div>
<div class="panel-body">
<div class="slimScroller" style="height:280px; position: relative;" data-rail-visible="1" data-always-visible="1">
<strong>Name:</strong>
<a href="https://ABC"> </a><br />
<strong>ID No:</strong> XXXXX<br />
<strong>Status:</strong> ACTIVE<br />
<strong>Class:</strong> 5<br />
<strong>Category:</strong> A<br />
<strong>Marks:</strong> 500<br />
</div>
</div>
</div>
</div>
How can I retrieve this data using jsoup, or in any other way? Please help.

You can use Element.nextElementSibling() and/or Element.nextSibling() to get the desired output:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Exam {
    public static void main(String[] args) {
        String html = "<div class=\"bs-example\">" +
                " <div class=\"panel panel-primary\">" +
                " <div class=\"panel-heading\">" +
                " <h3 class=\"panel-title\">ABC</h3>" +
                " </div>" +
                " <div class=\"panel-body\">" +
                " <div class=\"slimScroller\" style=\"height:280px; position: relative;\" data-rail-visible=\"1\" data-always-visible=\"1\">" +
                " <strong>Name:</strong>" +
                " <a href=\"https://ABC\"> </a><br />" +
                " <strong>ID No:</strong> XXXXX<br />" +
                " <strong>Status:</strong> ACTIVE<br />" +
                " <strong>Class:</strong> 5<br />" +
                " <strong>Category:</strong> A<br />" +
                " <strong>Marks:</strong> 500<br />" +
                " </div>" +
                " </div>" +
                " </div>" +
                "</div>";

        Document doc = Jsoup.parse(html);
        // Every <strong> label inside the slimScroller div
        Elements eles = doc.select("div.slimScroller strong");
        for (Element e : eles) {
            // If the label is followed by an <a>, print its href without the scheme;
            // otherwise print the node (the text) that directly follows the label.
            System.out.println(e.text() +
                    (e.nextElementSibling().tagName().equals("a")
                            ? e.nextElementSibling().attr("href").replace("https://", "")
                            : e.nextSibling().toString()));
        }
    }
}
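To make the difference between the two sibling methods concrete, here is a minimal sketch. The class `SiblingDemo` and its two helper methods are hypothetical names introduced only for illustration. It assumes a jsoup version that provides `selectFirst` (1.11.1 or later). `nextSibling()` returns the next node of any kind, including plain text, while `nextElementSibling()` skips text nodes and returns the next element, or `null` if there is none.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class SiblingDemo {

    // Text of the node directly after the first <strong>, or "" if that node is not text.
    static String textAfterFirstStrong(String html) {
        Element strong = Jsoup.parseBodyFragment(html).selectFirst("strong");
        if (strong == null) return "";
        Node next = strong.nextSibling();
        return (next instanceof TextNode) ? ((TextNode) next).text().trim() : "";
    }

    // Tag name of the next element after the first <strong>, or "" if there is none.
    static String tagAfterFirstStrong(String html) {
        Element strong = Jsoup.parseBodyFragment(html).selectFirst("strong");
        if (strong == null) return "";
        Element next = strong.nextElementSibling();
        return next == null ? "" : next.tagName();
    }

    public static void main(String[] args) {
        String html = "<div><strong>Status:</strong> ACTIVE<br /></div>";
        System.out.println(textAfterFirstStrong(html)); // the text node after the label
        System.out.println(tagAfterFirstStrong(html));  // the element after the label
    }
}
```

For the `Status:` line above, the text node is `ACTIVE` and the next element is the `br`, which is why the answer reads the value with `nextSibling()` and only uses `nextElementSibling()` to check for an `a` tag.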
The following code should produce the specified output, based on your comment about the a tag:
private static void printStudentInfo(Document document) {
    Elements students = document.select("div.slimScroller strong");
    for (Element student : students) {
        System.out.print(student.text());
        System.out.println(student.nextElementSibling().tagName().equals("a")
                ? student.nextElementSibling().text()
                : student.nextSibling().toString());
    }
}
Comments:

Hi Eritrean, thank you very much for your help... but this throws a NullPointerException at (e.nextElementSibling().tagName().equals("a")?

Did you try this with the original HTML snippet or with the one edited by @anurag? Something may be missing from the snippet you first posted. With the HTML included in the code, I don't get an NPE.

I tried it with the original snippet. My "a" tag reads: Name: ABC
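One likely cause of the NullPointerException is a strong tag that has no element after it, since nextElementSibling() returns null in that case. Below is a null-safe sketch of the same idea, not the answerer's code. The class `NullSafeExtractor` and the method `extract` are hypothetical names, and the sample HTML in `main` assumes the a tag contains the text ABC, as described in the last comment. It walks the nodes after each label until it reaches a br or the next strong, so it never dereferences a missing sibling.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

public class NullSafeExtractor {

    // Collects label -> value pairs for every <strong> inside div.slimScroller.
    static Map<String, String> extract(String html) {
        Map<String, String> result = new LinkedHashMap<>();
        Document doc = Jsoup.parse(html);
        for (Element label : doc.select("div.slimScroller strong")) {
            StringBuilder value = new StringBuilder();
            // Walk the following nodes; stop at <br>, the next <strong>, or the end.
            for (Node n = label.nextSibling(); n != null; n = n.nextSibling()) {
                if (n instanceof TextNode) {
                    value.append(((TextNode) n).text());
                } else if (n instanceof Element) {
                    Element el = (Element) n;
                    String tag = el.tagName();
                    if (tag.equals("br") || tag.equals("strong")) {
                        break;
                    }
                    // Any other element (for example <a>) contributes its text.
                    value.append(el.text());
                }
            }
            result.put(label.text(), value.toString().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample: the last label has no element after it.
        String html = "<div class=\"slimScroller\">"
                + "<strong>Name:</strong> <a href=\"https://ABC\">ABC</a><br />"
                + "<strong>Marks:</strong> 500"
                + "</div>";
        extract(html).forEach((k, v) -> System.out.println(k + " " + v));
    }
}
```

Returning a map rather than printing directly keeps the labels and values separate, which makes it easier to check which field is missing if the real page differs from the posted snippet.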