Java: web scraping data from a given table using the Jsoup library
I'm trying to scrape some data from a web page, but I can't get it to work. I tried doing it with substring(), but that is very inefficient. Here is part of the code I wrote:
String url = "https://www.premierleague.com/tables";
Document document = Jsoup.connect(url).get();
// Take the first table on the page and collect its rows
Element table = document.select("table").get(0);
Elements rows = table.select("tr");
// Second row of the table (index 1) and its cells
Element row = rows.get(1);
Elements cols = row.select("td");
Could anyone help me with a few examples using the same link?
import java.util.Iterator;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

String url = "https://www.premierleague.com/tables";
Document doc = Jsoup.connect(url).get();
Element table = doc.select("table").first();
// One iterator per column, each selected by an attribute of its cells
Iterator<Element> team = table.select("td[class=team]").iterator();
Iterator<Element> rank = table.select("td[id=tooltip]").iterator();
Iterator<Element> points = table.select("td[class=points]").iterator();
// Print the first entry of each column
System.out.println(team.next().text());
System.out.println(rank.next().text());
System.out.println(points.next().text());
Edit:
To answer your question:
System.out.println(team.next().text());
System.out.println(rank.next().text());
System.out.println(points.next().text());
// Advance each iterator past the next three entries
team.next();
team.next();
team.next();
rank.next();
rank.next();
rank.next();
points.next();
points.next();
points.next();
// Print the entry that each iterator now points at
System.out.println(team.next().text());
System.out.println(rank.next().text());
System.out.println(points.next().text());
Output:
ChelseaCHE
1 Previous Position 1
46
ChelseaCHE
1 Previous Position 1
46
Tottenham HotspurTOT
5 Previous Position 5
33
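Calling next() repeatedly to skip entries works, but it gets unwieldy for arbitrary ranks. A minimal sketch of an alternative is to select the data rows once and index into them. The HTML below is a hypothetical, simplified stand-in rather than the real page: only the team and points class names come from the answer above, and "Club B" to "Club D" with their points are invented placeholders. The class name RankLookup and the helper teamAndPoints are also illustrative.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class RankLookup {
    // Hypothetical, simplified markup (NOT the real page). Club B to Club D
    // and their points are invented placeholders.
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><th>Club</th><th>Pts</th></tr>"
        + "<tr><td class=\"team\">Chelsea</td><td class=\"points\">46</td></tr>"
        + "<tr><td class=\"team\">Club B</td><td class=\"points\">40</td></tr>"
        + "<tr><td class=\"team\">Club C</td><td class=\"points\">39</td></tr>"
        + "<tr><td class=\"team\">Club D</td><td class=\"points\">37</td></tr>"
        + "<tr><td class=\"team\">Tottenham Hotspur</td><td class=\"points\">33</td></tr>"
        + "</table>";

    // Returns "team points" for a 1-based rank, or null if no such row exists.
    static String teamAndPoints(String html, int rank) {
        Document doc = Jsoup.parse(html);
        // Keep only rows that contain data cells, which drops the header row.
        Elements rows = doc.select("table tr:has(td)");
        if (rank < 1 || rank > rows.size()) {
            return null;
        }
        Element row = rows.get(rank - 1);
        return row.select("td[class=team]").text()
            + " " + row.select("td[class=points]").text();
    }

    public static void main(String[] args) {
        // Jump straight to ranks 1 and 5 without stepping through the others.
        for (int rank : new int[] {1, 5}) {
            System.out.println(teamAndPoints(SAMPLE_HTML, rank));
        }
    }
}
```

Against the live page, the same idea would presumably start from Jsoup.connect(url).get() instead of Jsoup.parse(html), and the selectors may need adjusting to the actual markup.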
For example, the position, team, and points of the team ranked 1.
Khalil, but what should I do if I need individual entries, say the team ranked 5th, then the 9th, and so on?
I got it working. Also, I can't scrape the other fields that have no class, such as won, drawn, lost, and GD. Could you offer some insight on that too?
@codeploler In the code, change the td or tr selector and add whatever is specific to that element to identify it; that way you can scrape it.
Example, please?
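For cells that carry no class, one option is to address them by position within the row. The sketch below looks up the column index from the header text and then reads the matching td. It runs on hypothetical markup, not the real page: the header labels, "Club B", and every number in it are invented for illustration, and ColumnLookup and cell are made-up names.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class ColumnLookup {
    // Hypothetical markup (NOT the real page); all values are invented.
    static final String SAMPLE_HTML =
        "<table>"
        + "<tr><th>Club</th><th>Won</th><th>Drawn</th><th>Lost</th><th>GD</th></tr>"
        + "<tr><td class=\"team\">Chelsea</td><td>15</td><td>1</td><td>3</td><td>+27</td></tr>"
        + "<tr><td class=\"team\">Club B</td><td>13</td><td>4</td><td>2</td><td>+24</td></tr>"
        + "</table>";

    // Returns the text of the cell under the given header for a 1-based rank,
    // or null if the table, header, or row cannot be found.
    static String cell(String html, int rank, String header) {
        Element table = Jsoup.parse(html).select("table").first();
        if (table == null) {
            return null;
        }
        Elements rows = table.select("tr");
        if (rows.isEmpty()) {
            return null;
        }
        // Row 0 is the header row in this sample; find the wanted column in it.
        Elements headers = rows.get(0).select("th");
        int column = -1;
        for (int i = 0; i < headers.size(); i++) {
            if (headers.get(i).text().equalsIgnoreCase(header)) {
                column = i;
                break;
            }
        }
        if (column < 0 || rank < 1 || rank >= rows.size()) {
            return null;
        }
        // Data row for the rank, then the cell at the same position as the header.
        Elements cells = rows.get(rank).select("td");
        return column < cells.size() ? cells.get(column).text() : null;
    }

    public static void main(String[] args) {
        System.out.println(cell(SAMPLE_HTML, 1, "Won"));
        System.out.println(cell(SAMPLE_HTML, 2, "GD"));
    }
}
```

This assumes each data row has the same number of cells as the header row. If the real table uses colspan or inserts extra rows between teams, the index arithmetic would need adjusting.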