Python code troubleshooting (data scraping)

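I'm new to Python and data scraping, so I'm trying to learn as I go. I don't understand why the line "df = pd.read_html(url)[1]" doesn't work inside the for loop, when it works fine outside the loop. Any help would be appreciated. Thanks in advance!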

import pandas as pd
team_abbrev = ['crd','atl','rav','buf']
year = 2020
list_of_dataframes = []

for team in team_abbrev:
    url = "pro-football-reference.com/teams/" + team + "/" + str(year) + ".htm" 
    print(url)   
    df = pd.read_html(url)[1]

Error message:

Traceback (most recent call last):
  File "c:\Users\14074\AppData\Local\Programs\Python\Python39\teams.py", line 12, in <module>
    df = pd.read_html(url)
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\util\_decorators.py", line 299, in wrapper
    return func(*args, **kwargs)
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 1085, in read_html
    return _parse(
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 913, in _parse
    raise retained
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 893, in _parse
    tables = p.parse_tables()
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 213, in parse_tables
    tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
  File "C:\Users\14074\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\io\html.py", line 543, in _parse_tables
    raise ValueError("No tables found")
ValueError: No tables found

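Your URL appears to be missing https. Without a scheme, pd.read_html does not treat the string as a web address, so the page is never fetched and no tables are found. Change the url line from

url = "pro-football-reference.com/teams/" + team + "/" + str(year) + ".htm"

to

url = "https://pro-football-reference.com/teams/" + team + "/" + str(year) + ".htm"

and it works.

OMG, thank you. What a silly mistake. That worked. Thanks. I don't see a close link, so if you want to close this, you can, or tell me how to close it. Thanks, and yes, I got the answer to my question. I tried to upvote, but it says I can't vote on my own question :(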
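
For reference, here is a minimal sketch of the loop with the https fix applied and the results actually collected into list_of_dataframes, which the original loop never did. The helper names build_team_url, has_web_scheme and scrape_teams are hypothetical and not part of the question or of pandas. The table index 1 is carried over from the question and is assumed to be the table wanted on every team page.

```python
from urllib.parse import urlparse

# Base address taken from the answer; the scheme is what the original loop lacked.
BASE_URL = "https://pro-football-reference.com/teams"


def build_team_url(team, year):
    """Return the full season-page URL for one team abbreviation."""
    return f"{BASE_URL}/{team}/{year}.htm"


def has_web_scheme(url):
    """Return True if the string starts with an http or https scheme."""
    return urlparse(url).scheme in ("http", "https")


def scrape_teams(team_abbrev, year, table_index=1):
    """Fetch one table per team and return the DataFrames in a list."""
    import pandas as pd  # imported here so the URL helpers work without pandas

    frames = []
    for team in team_abbrev:
        url = build_team_url(team, year)
        # Fail early with a clear message instead of "No tables found".
        if not has_web_scheme(url):
            raise ValueError(f"URL has no http(s) scheme: {url}")
        frames.append(pd.read_html(url)[table_index])
    return frames
```

Calling scrape_teams(['crd', 'atl', 'rav', 'buf'], 2020) needs network access and an HTML parser such as lxml installed. The site may also reject or rate-limit automated requests, so a short pause between requests may be needed.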