Troubleshooting Python code (data scraping)
Tags: python, python-3.x, web-scraping
I'm new to Python and data scraping, so I'm trying to learn as I go. I don't understand why the line "df = pd.read_html(url)[1]" fails inside the for loop but works fine outside it. Any help would be greatly appreciated. Thanks in advance!
import pandas as pd

team_abbrev = ['crd','atl','rav','buf']
year = 2020
list_of_dataframes = []

for team in team_abbrev:
    url = "pro-football-reference.com/teams/" + team + "/" + str(year) + ".htm"
    print(url)
    df = pd.read_html(url)[1]
Error message:
Traceback (most recent call last):
  File "c:\Users\14074\AppData\Local\Programs\Python\Python39\teams.py", line 12, in <module>
    df = pd.read_html(url)
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\util\_decorators.py", line 299, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 1085, in read_html
    return _parse(
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 913, in _parse
    raise retained
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 893, in _parse
    tables = p.parse_tables()
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 213, in parse_tables
    tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 543, in _parse_tables
    raise ValueError("No tables found")
ValueError: No tables found
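Your URL appears to be missing https. Without the https:// prefix, pandas does not treat the string as a web address to download, so it finds no tables.
Change the line that builds the URL (line 8) from
url = "pro-football-reference.com/teams/" + team + "/" + str(year) + ".htm"
to
url = "https://pro-football-reference.com/teams/" + team + "/" + str(year) + ".htm"
and it works.

OMG, thank you. What a silly mistake. That worked. Thanks.
I don't see a close link, so if you want to close this you can, or tell me how to close it.
Thanks, yes, I got the answer to my question. I tried to upvote, but it says I can't vote for my own question :(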
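
For anyone who hits the same error, here is a minimal sketch of the corrected loop. It is only one way to arrange it, not the asker's final script. The helper names build_team_url, has_scheme and fetch_team_tables, and the reader parameter, are made up for illustration. The sketch also appends each table to a list, which the original loop had not done yet. pandas is imported only when the real reader is needed, so the URL helpers can be tried without network access.

```python
from urllib.parse import urlparse


def build_team_url(team, year):
    """Return the team-season URL, including the https:// scheme."""
    return f"https://pro-football-reference.com/teams/{team}/{year}.htm"


def has_scheme(url):
    """True if the string starts with http:// or https://."""
    return urlparse(url).scheme in ("http", "https")


def fetch_team_tables(team_abbrev, year, table_index=1, reader=None):
    """Collect one table per team page.

    reader is a hypothetical hook: any callable that takes a URL and
    returns a list of tables. It defaults to pandas.read_html.
    """
    if reader is None:
        import pandas as pd  # lazy import: only needed for real requests
        reader = pd.read_html

    list_of_dataframes = []
    for team in team_abbrev:
        url = build_team_url(team, year)
        # Fail early with a clear message instead of "No tables found".
        if not has_scheme(url):
            raise ValueError(f"URL is missing http(s)://: {url}")
        list_of_dataframes.append(reader(url)[table_index])
    return list_of_dataframes


if __name__ == "__main__":
    # This performs real HTTP requests when run as a script.
    frames = fetch_team_tables(['crd', 'atl', 'rav', 'buf'], 2020)
    print(len(frames))
```

The has_scheme check is there because the original failure was silent about its cause: a quick test on the string before calling read_html points straight at the missing prefix.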