Java web crawler: getting span elements from Amazon

java web-crawler

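I am scraping categories on Amazon and already have the sales rank and the product URL. Now I want to crawl the categories and get all the information out of the category span: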

<span class="zg_hrsr_ladder">in&nbsp;<a href="https://www.amazon.de/gp/bestsellers/books/ref=pd_zg_hrsr_b_1_1">B&uuml;cher</a> &gt; <a href="https://www.amazon.de/gp/bestsellers/books/287480/ref=pd_zg_hrsr_b_1_2">Krimis & Thriller</a> &gt; <b><a href="https://www.amazon.de/gp/bestsellers/books/419954031/ref=pd_zg_hrsr_b_1_3_last">Deutschland</a></b></span>

I get everything inside the span, but I only want the text of the a href elements: "Bücher", "Krimis & Thriller" and "Deutschland". How can I get this information?

You want the text of the <a> elements inside the <span class="zg_hrsr_ladder"> element.

Sample code:
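import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String source = "<span class=\"zg_hrsr_ladder\">in&nbsp;<a href=\"https://www.amazon.de/gp/bestsellers/books/ref=pd_zg_hrsr_b_1_1\">B&uuml;cher</a> &gt; <a href=\"https://www.amazon.de/gp/bestsellers/books/287480/ref=pd_zg_hrsr_b_1_2\">Krimis & Thriller</a> &gt; <b><a href=\"https://www.amazon.de/gp/bestsellers/books/419954031/ref=pd_zg_hrsr_b_1_3_last\">Deutschland</a></b></span>";

// Parse the HTML string. The two-argument overload Jsoup.parse(String, String)
// expects a base URI as its second argument, not a charset.
Document htmlDocument = Jsoup.parse(source);

// Select every <a> element nested inside the span with class zg_hrsr_ladder.
Elements category = htmlDocument.select("span.zg_hrsr_ladder a");

// Print only the link text of each category.
category.forEach(aElement -> {
    System.out.println(aElement.text());
});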

Don't crawl; use the API instead...

Thank you very much. This helped me!
Output of the sample code:

Bücher
Krimis & Thriller
Deutschland
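
If the category links are needed along with the names, the same selector can feed a small helper. The following is only a sketch: it assumes jsoup is on the classpath, and the class name CategoryLadder and the method parseLadder are hypothetical names chosen for illustration, not part of jsoup or of the original answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CategoryLadder {

    // Hypothetical helper: maps each category name to its link,
    // keeping the order in which the links appear inside the span.
    static Map<String, String> parseLadder(String html) {
        Document doc = Jsoup.parse(html);
        Map<String, String> categories = new LinkedHashMap<>();
        for (Element link : doc.select("span.zg_hrsr_ladder a")) {
            // text() returns the decoded link text, attr("href") the raw link target.
            categories.put(link.text(), link.attr("href"));
        }
        return categories;
    }

    public static void main(String[] args) {
        String html = "<span class=\"zg_hrsr_ladder\">in&nbsp;"
                + "<a href=\"https://www.amazon.de/gp/bestsellers/books/ref=pd_zg_hrsr_b_1_1\">B&uuml;cher</a> &gt; "
                + "<a href=\"https://www.amazon.de/gp/bestsellers/books/287480/ref=pd_zg_hrsr_b_1_2\">Krimis & Thriller</a> &gt; "
                + "<b><a href=\"https://www.amazon.de/gp/bestsellers/books/419954031/ref=pd_zg_hrsr_b_1_3_last\">Deutschland</a></b></span>";

        // Print one "name -> url" line per category.
        parseLadder(html).forEach((name, url) -> System.out.println(name + " -> " + url));
    }
}
```

A LinkedHashMap is used so the breadcrumb order (top-level category first) is preserved. If two links in one span could share the same text, a list of name/URL pairs would be the safer choice.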