Python: getting table results from Wikipedia

python web-scraping

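I need to collect the country code top-level domains (such as ac) and the corresponding countries in two separate arrays (see the linked page).

I tried the following to get the information I need: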

import requests
from bs4 import BeautifulSoup as bs

with requests.Session() as s:  # use a session object to reuse the TCP connection
    s.headers = {'User-Agent': 'Mozilla/5.0'}
    r = s.get('https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains')
    soup = bs(r.content, 'lxml')
    # print(soup.prettify())

# find() returns only the first table whose class matches
tld_table = soup.find('table', {'class': 'wikitable sortable'})
links = tld_table.find_all('a')

However, when I select on class 'wikitable sortable', it gives me the wrong results. Can you help me figure this out?

Thanks

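Have you checked the HTML you get back from the request?

@AMC, yes, I used Google Chrome's inspect tool, and the data should be in the 'wikitable sortable' table. But I am not entirely familiar with HTML and web scraping.

My comment may not have been clear enough; I asked about the HTML from the request for a reason. What you see in the browser is often different from what the request returns.

I see. How can I choose a suitable request? Is there a way to know what I need to get from a website before using it? I mean, if I need to scrape more websites to get the information used on them (for example, to collect all images), should I do this automatically, or do I need to check the sites one by one?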
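One way to act on the advice in the comments is to list every table in the HTML that the request actually returned, together with its class and header cells, before choosing one. The sketch below is only illustrative: `SAMPLE_HTML` and `summarize_tables` are made-up names, and the sample markup is an invented stand-in for a page where several tables share the same class, which may be why `find` picks an unexpected table.

```python
from bs4 import BeautifulSoup

# Invented stand-in for a page with several tables sharing one class.
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Name</th><th>Entity</th></tr>
  <tr><td>.com</td><td>commercial</td></tr>
</table>
<table class="wikitable sortable">
  <tr><th>Name</th><th>Entity</th><th>Notes</th></tr>
  <tr><td>.ac</td><td>Ascension Island</td><td></td></tr>
</table>
"""


def summarize_tables(html):
    """Return (index, classes, header texts) for every <table> in html."""
    soup = BeautifulSoup(html, "html.parser")
    summary = []
    for index, table in enumerate(soup.find_all("table")):
        headers = [th.get_text(strip=True) for th in table.find_all("th")]
        summary.append((index, table.get("class", []), headers))
    return summary


# Print one line per table so the right index can be picked by eye.
for row in summarize_tables(SAMPLE_HTML):
    print(row)
```

On the real page, the same function could be given `r.text` from the request; `soup.find_all(...)[index]` would then select a specific table instead of the first match.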
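import pandas as pd

# read_html returns a list with one DataFrame per table found on the page
tables = pd.read_html("https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains")
tables[6]  # the table of countries (the index may change if the page is edited)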
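Since the question asks for two separate arrays, a selected table can be split by column. This is a minimal sketch rather than a definitive solution: `split_columns` is a hypothetical helper, the column names "Name" and "Entity" are assumptions about the table's headers, and the small DataFrame stands in for the table returned by `pd.read_html` so the example runs without a network request.

```python
import pandas as pd


def split_columns(table, tld_column="Name", country_column="Entity"):
    """Return (tlds, countries) as two plain lists.

    The default column names are assumptions; check table.columns first.
    """
    # Strip the leading dot so ".ac" becomes "ac", as in the question.
    tlds = table[tld_column].str.lstrip(".").tolist()
    countries = table[country_column].tolist()
    return tlds, countries


# Stand-in for the country table taken from pd.read_html(...).
sample = pd.DataFrame(
    {"Name": [".ac", ".ad"], "Entity": ["Ascension Island", "Andorra"]}
)

tlds, countries = split_columns(sample)
print(tlds)
print(countries)
```

Printing `table.columns` on the real DataFrame first shows which header names to pass in.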