How to extract a table from Wikipedia using BeautifulSoup (Python 3.x)


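I'm trying to write a scraper that extracts a table from a Wikipedia page. The problem is that I can extract every table on the page except the one I actually need (the table with statistics on every US election ever held). I don't think the problem is with my tags.
Here is my code: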

from urllib.error import HTTPError
from urllib.error import URLError
from bs4 import BeautifulSoup
from urllib.request import urlopen

#getting the wiki page
page_info=urlopen('https://en.wikipedia.org/wiki/United_States_presidential_election')

soup=BeautifulSoup(page_info, 'html.parser')

headline=soup.find('table', "wikitable sortable jquery-tablesorter")
print(headline)

I think I'm missing something important, but I can't put my finger on it. Can anyone help?

One way to do it:

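from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup

# Fetch the HTML as the server sends it, before any JavaScript has run
page = requests.get('https://en.wikipedia.org/wiki/United_States_presidential_election').text
soup = BeautifulSoup(page, 'html.parser')
# Match the class string present in the served HTML
table = soup.find('table', class_="wikitable sortable")

# read_html returns a list of DataFrames; wrapping the markup in StringIO
# avoids the literal-HTML deprecation warning in recent pandas versions
tables = pd.read_html(StringIO(str(table)))
df = pd.concat(tables)
print(df)
df.to_csv("elections.csv", index=False)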

Which outputs:

     Year                                    Party  ... Electoral votes      Notes
0    1788                              Independent  ...        69 / 138        NaN
1    1788                               Federalist  ...        34 / 138        NaN
2    1788                               Federalist  ...         9 / 138        NaN
3    1788                               Federalist  ...         6 / 138        NaN
4    1788                               Federalist  ...         6 / 138        NaN
..    ...                                      ...  ...             ...        ...
[219 rows x 8 columns]

The same data is also written to a .csv file (elections.csv).
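
If pandas is not an option, the rows can also be pulled out with BeautifulSoup alone. The following is only a rough sketch, not the method used above: SAMPLE_HTML is a made-up stand-in for the real page, and table_to_rows is a hypothetical helper name. It does not handle rowspan or colspan cells, which pandas.read_html expands for you, so treat it as a starting point.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical, tiny stand-in for the election table (not the real page)
SAMPLE_HTML = """
<table class="wikitable sortable">
  <tr><th>Year</th><th>Party</th><th>Electoral votes</th></tr>
  <tr><td>1788</td><td>Independent</td><td>69 / 138</td></tr>
  <tr><td>1788</td><td>Federalist</td><td>34 / 138</td></tr>
</table>
"""


def table_to_rows(table):
    """Return each <tr> as a list of stripped cell texts (header and data cells alike)."""
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("table", class_="wikitable sortable")
rows = table_to_rows(table)

# Write to an in-memory buffer here; for a real file, pass
# open("elections.csv", "w", newline="") to csv.writer instead
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

The first row holds the header cells, so the resulting CSV has the same column order as the table.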

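Note: whenever you scrape, always disable JS (JavaScript) in your browser while inspecting the page. BeautifulSoup does not see dynamically rendered content. That is why you got nothing back: without JS, the class of the tag you want is different. The jquery-tablesorter class is added by a script after the page loads, so it never appears in the HTML that urlopen or requests receives.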
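
To illustrate the note, here is a small offline sketch. STATIC_HTML is an assumed stand-in for what the server sends before any script runs. It is not copied from Wikipedia. The select_one call is an extra alternative, not part of the answer above.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the served HTML (no JavaScript has run yet)
STATIC_HTML = """
<table class="wikitable sortable">
  <tr><th>Year</th><th>Party</th></tr>
  <tr><td>1788</td><td>Independent</td></tr>
</table>
"""

soup = BeautifulSoup(STATIC_HTML, "html.parser")

# Class string copied from the browser inspector: it includes a class that
# only exists after scripts have run, so nothing in the served HTML matches
from_inspector = soup.find("table", "wikitable sortable jquery-tablesorter")

# Class string as it appears in the served HTML: this one matches
as_served = soup.find("table", class_="wikitable sortable")

# A CSS selector matches on individual classes, regardless of their order
by_selector = soup.select_one("table.wikitable.sortable")

print(from_inspector is None)
print(as_served is not None)
print(by_selector is not None)
```

The CSS selector form is a bit more forgiving, since it still matches if the table carries additional classes or lists them in a different order.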