Python: Is there a different way to scrape this with pandas?

python pandas

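I'm trying to scrape tables, and in particular one specific table, from the page at the URL below. The only way I really know how to scrape tables is the pandas read_html function, so that's what I've been using. When I call len on the result, pandas tells me there are only 5 tables, when there are actually 14. The table I want the data from appears on the page, but pandas acts as if it doesn't exist. The code I'm using is as follows: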

import pandas as pd

url = "https://www.basketball-reference.com/teams/BOS/1980.html"

# read_html returns a list of DataFrames, one per table it finds
tables = pd.read_html(url)

When I run it and look through all the tables, I only get 5. Can anyone help?

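Turn off JavaScript in your browser and reload the page: the table is not displayed.
View the page source and you will see that the table is commented out.
You can use BeautifulSoup to get at the comment.
Then push that HTML into
```
pd.read_html()
```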

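| | Unnamed: 0 | G | MP | FG | FGA | FG% | 3P | 3PA | 3P% | 2P | 2PA | 2P% | FT | FTA | FT% | ORB | DRB | TRB | AST | STL | BLK | TOV | PF | PTS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Team | 82 | 19880 | 3617 | 7387 | 0.49 | 162 | 422 | 0.384 | 3455 | 6965 | 0.496 | 1907 | 2449 | 0.779 | 1227 | 2457 | 3684 | 2198 | 809 | 308 | 1539 | 1974 | 9303 |
| 1 | Team/G | NaN | 242.4 | 44.1 | 90.1 | 0.49 | 2 | 5.1 | 0.384 | 42.1 | 84.9 | 0.496 | 23.3 | 29.9 | 0.779 | 15 | 30 | 44.9 | 26.8 | 9.9 | 3.8 | 18.8 | 24.1 | 113.5 |
| 2 | Lg Rank | NaN | 4 | 8 | 14 | 7 | 2 | 2 | 1 | 15 | 17 | 7 | 4 | 6 | 5 | 13 | 10 | 11 | 8 | 6 | 21 | 11 | 13 | 5 |
| 3 | Year/Year | NaN | 1.0% | 2.6% | 0.5% | 0.009 | NaN | NaN | NaN | -2.0% | -5.2% | 0.016 | 4.8% | 5.5% | -0.005 | 9.7% | 2.5% | 4.8% | 10.2% | 13.9% | 8.8% | -10.2% | -0.2% | 4.8% |
| 4 | Opponent | 82 | 19880 | 3439 | 7313 | 0.47 | 74 | 259 | 0.286 | 3365 | 7054 | 0.477 | 1712 | 2222 | 0.77 | 1168 | 2294 | 3462 | 1867 | 686 | 419 | 1635 | 2059 | 8664 |
| 5 | Opponent/G | NaN | 242.4 | 41.9 | 89.2 | 0.47 | 0.9 | 3.2 | 0.286 | 41 | 86 | 0.477 | 20.9 | 27.1 | 0.77 | 14.2 | 28 | 42.2 | 22.8 | 8.4 | 5.1 | 19.9 | 25.1 | 105.7 |
| 6 | Lg Rank | NaN | 4 | 6 | 7 | 8 | 17 | 17 | 15 | 5 | 7 | 8 | 11 | 10 | 17 | 6 | 4 | 2 | 3 | 2 | 11 | 9 | 6 | 6 |
| 7 | Year/Year | NaN | 1.0% | -10.8% | -3.7% | -0.037 | NaN | NaN | NaN | -12.7% | -7.1% | -0.031 | 8.5% | 6.9% | 0.011 | 4.1% | -6.5% | -3.2% | -14.0% | -4.3% | -4.3% | 2.0% | 1.7% | -6.7% |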

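Turn off JavaScript in your browser and reload the page. You will see that the table isn't there. It is built dynamically with JavaScript, which means it has to be built in the browser, so you would need something like Selenium to scrape it.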

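from io import StringIO

import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment

res = requests.get("https://www.basketball-reference.com/teams/BOS/1980.html")
table_id = "div_team_and_opponent"  # renamed from `id` to avoid shadowing the builtin
soup = BeautifulSoup(res.content, "html.parser")
# The table sits inside an HTML comment, so find the comment that mentions its id
comments = soup.find_all(string=lambda text: isinstance(text, Comment) and table_id in text)
# Parse the commented-out HTML; newer pandas deprecates passing a literal HTML string, hence StringIO
pd.read_html(StringIO(str(comments[0])))[0]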
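
The same idea can be tried offline. Below is a minimal sketch, assuming pandas, beautifulsoup4 and a parser backend for `read_html` (such as lxml) are installed. `SAMPLE_HTML` and `read_all_tables` are hypothetical names made up for illustration; the snippet only mimics the shape of a page with one visible table and one commented-out table, not the real site's markup.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for a page: one visible table, one inside an HTML comment.
SAMPLE_HTML = """
<html><body>
<table id="visible"><tr><th>a</th><th>b</th></tr><tr><td>1</td><td>2</td></tr></table>
<div id="all_hidden">
<!--
<div id="div_hidden">
<table id="hidden"><tr><th>c</th><th>d</th></tr><tr><td>3</td><td>4</td></tr></table>
</div>
-->
</div>
</body></html>
"""


def read_all_tables(html):
    """Return tables from the live markup plus tables found inside HTML comments."""
    soup = BeautifulSoup(html, "html.parser")
    # Tables that read_html can see directly.
    tables = pd.read_html(StringIO(html))
    # Tables hidden in comments: parse each comment's text as its own HTML fragment.
    for comment in soup.find_all(string=lambda t: isinstance(t, Comment)):
        if "<table" not in comment:
            continue
        tables.extend(pd.read_html(StringIO(str(comment))))
    return tables


visible_only = pd.read_html(StringIO(SAMPLE_HTML))
everything = read_all_tables(SAMPLE_HTML)
print(len(visible_only), len(everything))
```

Scanning every comment rather than matching a single id is a design choice for when you want all of a page's tables at once; if only one table is needed, filtering on its id as in the answer's code does less work.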