Python: trying to screen-scrape by table number
I pieced together some very simple code that I thought would let me scrape various tables from a URL. Here is my code:
import pandas as pd
import requests
from bs4 import BeautifulSoup
res = requests.get("https://www.federalreserve.gov/releases/h8/current/default.htm")
soup = BeautifulSoup(res.content,'lxml')
table = soup.find_all('table')[1]
df = pd.read_html(str(table))
print(df)
I am trying to get the data from table 1, but I get the following error when I run the code:
IndexError: list index out of range
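I want to enter a table number and get the data from that specific table. There are 11 tables on the page in total. What am I doing wrong?

I believe you just want the standard HTML parser rather than lxml. Change: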
soup = BeautifulSoup(res.content,'lxml')
to:
soup = BeautifulSoup(res.content,'html.parser')
Output:
[ Account 2019 Aug 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 Jun 2020 Jul 2020 Aug Week ending
Account Account.1 2019 Aug 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 Jun 2020 Jul 2020 Aug Aug 12 Aug 19 Aug 26 Sep 02
0 Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets
1 1 Bank credit 13606.2 13949.9 14419.2 14787.2 14851.9 14840.1 14880.2 14889.9 14878.1 14873.9 14893.1 14912.1
2 2 Securities in bank credit 2 3725.5 3871.0 3987.4 3989.3 4016.8 4166.5 4297.2 4363.6 4327.7 4357.0 4375.1 4421.3
...

I found a working solution!
[ Account 2019 Aug 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 Jun 2020 Jul 2020 Aug Week ending
Account Account.1 2019 Aug 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 Jun 2020 Jul 2020 Aug Aug 12 Aug 19 Aug 26 Sep 02
0 Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets Assets
1 1 Bank credit 13606.2 13949.9 14419.2 14787.2 14851.9 14840.1 14880.2 14889.9 14878.1 14873.9 14893.1 14912.1
2 2 Securities in bank credit 2 3725.5 3871.0 3987.4 3989.3 4016.8 4166.5 4297.2 4363.6 4327.7 4357.0 4375.1 4421.3
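
The table-number lookup asked about in the question can be sketched as below. This is only a minimal illustration, not the code from either answer: `select_table`, `table_rows`, and `SAMPLE_HTML` are hypothetical names, and the inline HTML is made-up data standing in for the real page. Instead of indexing the list blindly, it reports how many tables the chosen parser actually found, which makes it easier to see whether the parser is the cause of the IndexError.

```python
from bs4 import BeautifulSoup


def select_table(html, index, parser="html.parser"):
    """Return the table at zero-based position `index`, or raise a clear error."""
    tables = BeautifulSoup(html, parser).find_all("table")
    if not 0 <= index < len(tables):
        raise IndexError(
            f"asked for table {index}, but the parser found {len(tables)} table(s)"
        )
    return tables[index]


def table_rows(table):
    """Collect the text of every header and data cell, row by row."""
    return [
        [cell.get_text(strip=True) for cell in row.find_all(["th", "td"])]
        for row in table.find_all("tr")
    ]


# Tiny inline document standing in for the real page (made-up data).
SAMPLE_HTML = """
<table><tr><th>a</th></tr><tr><td>1</td></tr></table>
<table><tr><th>b</th><th>c</th></tr><tr><td>2</td><td>3</td></tr></table>
"""

rows = table_rows(select_table(SAMPLE_HTML, 1))
print(rows)
```

With the real page, `res.content` from the question would take the place of `SAMPLE_HTML`, and the returned table could still be handed to `pd.read_html` as in the original code. The index is zero-based here, matching `soup.find_all('table')[1]` in the question.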