Python: scraping box scores with BeautifulSoup and pandas and exporting them to Excel

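I have been trying to figure out how to scrape baseball box scores from Fangraphs using Python 3.6 with the BeautifulSoup and pandas modules. My end goal is to save different sections of the web page to different worksheets in Excel.

To do that, I think I have to pull each table separately by its id attribute. Below is the code that does this for the four tables (below the chart on the page) that make up the first Excel worksheet. Running the code produces the following error: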

Traceback (most recent call last):
  File "Fangraphs Box Score Scraper.py", line 14, in <module>
    df1 = pd.read_html(soup,attrs={'id': ['WinsBox1_dghb','WinsBox1_dghp','WinsBox1_dgab','WinsBox1_dgap']})
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 906, in read_html
    keep_default_na=keep_default_na)
  File "C:\Python36\lib\site-packages\pandas\io\html.py", line 743, in _parse
    raise_with_traceback(retained)
  File "C:\Python36\lib\site-packages\pandas\compat\__init__.py", line 344, in raise_with_traceback
    raise exc.with_traceback(traceback)
TypeError: 'NoneType' object is not callable

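You are using the wrong id values. The ids you passed do not belong to the table tags themselves; you need the ids of the actual table elements (the ones ending in _ctl00 in the code below). I also don't think you need BeautifulSoup here. Try: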

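import pandas as pd

url = 'http://www.fangraphs.com/boxscore.aspx?date=2017-09-10&team=Red%20Sox&dh=0&season=2017'
# read_html returns a list with one DataFrame per matching table
df1 = pd.read_html(
    url,
    attrs={'id': ['WinsBox1_dghb_ctl00', 'WinsBox1_dgab_ctl00']}
)

# Stack the tables on a single sheet, one below the other.
# The context manager saves the file on exit (writer.save() was removed in pandas 2.0).
with pd.ExcelWriter('Box Scores.xlsx') as writer:
    row = 0
    for df in df1:
        df.to_excel(writer, sheet_name='tables', startrow=row, startcol=0)
        # move past the table just written and leave a small gap
        row = row + len(df.index) + 3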
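
As for the TypeError in the question: it most likely comes from passing the BeautifulSoup object itself to read_html. The sketch below is only an illustration with a made-up one-table HTML string, not the Fangraphs page. It shows that a soup object appears to have a read attribute (so pandas seems to treat it as a file-like object), but that attribute is None, which would explain "'NoneType' object is not callable".

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration only
html = "<table id='demo'><tr><th>a</th></tr><tr><td>1</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

# Attribute access on a soup falls back to a tag search, so `soup.read`
# looks for a <read> tag and yields None instead of raising AttributeError.
looks_file_like = hasattr(soup, "read")
read_attr = soup.read  # None here, so calling soup.read() raises TypeError

# Converting the soup back to text gives read_html something it can parse.
html_text = str(soup)
```

If you do want to keep BeautifulSoup in the pipeline, passing the text (for example wrapped in io.StringIO(html_text)) or simply the URL to pd.read_html should avoid the error.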

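Since the end goal in the question is separate worksheets, here is a minimal sketch of one way to do that, assuming one worksheet per table id. The function names write_tables_to_sheets and sheet_name_for are hypothetical helpers of mine, not part of pandas, and using the table id as the sheet name is just a convenient choice.

```python
import pandas as pd

EXCEL_SHEET_NAME_LIMIT = 31  # Excel rejects longer worksheet names


def sheet_name_for(table_id):
    """Hypothetical helper: derive a worksheet name from a table id."""
    return table_id[:EXCEL_SHEET_NAME_LIMIT]


def write_tables_to_sheets(source, table_ids, path):
    """Write the table with each id to its own worksheet in one workbook.

    `source` is anything pd.read_html accepts (for example the page URL).
    """
    with pd.ExcelWriter(path) as writer:
        for table_id in table_ids:
            # read_html returns a list; an id should match a single table
            df = pd.read_html(source, attrs={'id': table_id})[0]
            df.to_excel(writer, sheet_name=sheet_name_for(table_id))
```

It could be called as write_tables_to_sheets(url, ['WinsBox1_dghb_ctl00', 'WinsBox1_dgab_ctl00'], 'Box Scores.xlsx'). Note that passing a URL this way downloads the page once per id; if that matters, download the page once and pass the text in via io.StringIO instead. To group several tables on one sheet, as the question describes, the startrow approach from the answer can be combined with this.
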
Please add the full error message.
Sorry about that.
Glad I could help. Please don't forget to accept the answer.