Java: How to scrape table data from a specific site with Jsoup
Tags: java, jsoup, screen-scraping


I am trying to get some data from the table on this site:

Here is the source code of the scraper I have tried so far:

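    import java.io.IOException;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    // The class name "Scraper" is a placeholder; the original post showed only main().
    public class Scraper {
        public static void main(String[] args) {
            String url = "https://www.worldometers.info/coronavirus/";
            try {
                // Fetch and parse the page.
                Document doc = Jsoup.connect(url).get();
                Element table = doc.getElementById("main_table_countries_today");
                if (table == null) {
                    // getElementById returns null when no element has this id.
                    System.err.println("Table not found");
                    return;
                }
                Elements rows = table.getElementsByTag("tr");

                // Print the text of every cell, one cell per line.
                for (Element row : rows) {
                    Elements tds = row.getElementsByTag("td");
                    for (int i = 0; i < tds.size(); i++) {
                        System.out.println(tds.get(i).text());
                    }
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }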

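Comments:

How is the website organized? If you visit it with a browser, can you select a specific country? Once you can do it manually, it should not be hard to automate. On the other hand, if you only want a subset of the available data, writing code that ignores the data you don't want should be simple, but it requires understanding how the site's HTML is organized. Show us what you have and how it goes wrong.

@tripleee Yes, I can select some countries, such as China. That opens a page where I can easily scrape the data from a div using getElementsByClass. This only works for China and a few other countries. If I want the data for Algeria, I have to scrape the table, and that is where I am stuck. Inside the table, each tr has the class "even" or "odd", and each one contains a handful of tds.

A proper Stack Overflow question would show some sample data and an outline of what you have already tried. I'm not a Java person, but what you describe should be straightforward with any halfway decent HTML parser.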
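The row lookup discussed in the comments (finding the tr for a single country such as Algeria and reading its td cells) might look something like the sketch below. It is only an illustration under stated assumptions: the class name CountryRowExample, the helper findCountryRow, and the inline sample HTML with its numbers are all made up for this example, and it assumes the country name appears as the full text of one cell in the row. The real page's markup may differ, so the cell matching would need to be checked against the live HTML.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not part of the original question.
public class CountryRowExample {

    // Returns the cell texts of the first row that has a cell whose text
    // equals `country` (ignoring case), or an empty list if the table or
    // the row is not found.
    public static List<String> findCountryRow(Document doc, String tableId, String country) {
        List<String> result = new ArrayList<>();
        Element table = doc.getElementById(tableId);
        if (table == null) {
            return result; // no element with this id
        }
        for (Element row : table.select("tr")) {
            Elements cells = row.select("td"); // header rows with only th yield no cells
            boolean matches = false;
            for (Element cell : cells) {
                if (cell.text().trim().equalsIgnoreCase(country)) {
                    matches = true;
                    break;
                }
            }
            if (matches) {
                // Keep empty cells too, so column positions stay stable.
                for (Element cell : cells) {
                    result.add(cell.text());
                }
                return result;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample markup that mimics the "odd"/"even" rows described
        // in the comments; the values are placeholders, not real data.
        String html = "<table id=\"main_table_countries_today\"><tbody>"
                + "<tr class=\"odd\"><td>China</td><td>100</td><td>5</td></tr>"
                + "<tr class=\"even\"><td>Algeria</td><td>20</td><td>1</td></tr>"
                + "</tbody></table>";
        Document doc = Jsoup.parse(html);
        System.out.println(findCountryRow(doc, "main_table_countries_today", "Algeria"));
        // prints [Algeria, 20, 1]
    }
}
```

To run this against the live site, the Document would come from Jsoup.connect(url).get() as in the question's code instead of Jsoup.parse on a string. Matching on cell text rather than on the "even"/"odd" class avoids depending on row styling, which says nothing about which country a row holds.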