
Python: findAll returns an empty list when used with BeautifulSoup


I'm practicing some web scraping, and for this project I'm scraping this website:

I fetch and parse the HTML like this:

from urllib.request import urlopen  # missing from the original snippet
from bs4 import BeautifulSoup       # missing from the original snippet

my_url = 'https://assetdash.com/?all=true'
client = urlopen(my_url)
page_html = client.read()
client.close()

soup = BeautifulSoup(page_html, 'html.parser')
rows = soup.findAll("tr", {"class": "Table__Tr-sc-1pfmqa-5 gNrtPb"})
print(len(rows))

This returns a length of 0, when it should return a much higher value. Did I make a mistake in the parsing, or am I retrieving the wrong rows?

The page is dynamic and rendered by JavaScript. Go straight to the data source instead.

Code:

import requests
import pandas as pd  # missing from the original snippet

my_url = 'https://assetdash.herokuapp.com/assets?currentPage=1&perPage=200&typesOfAssets[]=Stock&typesOfAssets[]=ETF&typesOfAssets[]=Cryptocurrency'

data = requests.get(my_url).json()
df = pd.DataFrame(data['data'])
print(df)
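The endpoint above is paginated via the `currentPage` and `perPage` query parameters, so fetching additional pages just means rebuilding the query string. A minimal standard-library sketch (the helper name `build_page_url` is my own, not part of the answer; note that `urlencode` percent-encodes the `[]` brackets, which servers treat the same as literal brackets):

```python
from urllib.parse import urlencode

BASE = 'https://assetdash.herokuapp.com/assets'

def build_page_url(page, per_page=200):
    # A list of tuples lets typesOfAssets[] repeat once per asset type.
    params = [
        ('currentPage', page),
        ('perPage', per_page),
        ('typesOfAssets[]', 'Stock'),
        ('typesOfAssets[]', 'ETF'),
        ('typesOfAssets[]', 'Cryptocurrency'),
    ]
    # urlencode escapes the brackets as %5B%5D, an equivalent encoding.
    return f'{BASE}?{urlencode(params)}'

print(build_page_url(2))
```

Looping `build_page_url` over a page range and concatenating the resulting frames would collect the full asset list rather than only the top 200.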
Output:

       id ticker  ... peRatio  rank
0      60   AAPL  ...   35.17     1
1    2287   MSFT  ...   34.18     2
2     251   AMZN  ...   91.52     3
3    1527  GOOGL  ...   33.79     4
4    1276     FB  ...   31.09     5
..    ...    ...  ...     ...   ...
195   537  BMWYY  ...   15.06   196
196  3756    WBK  ...   35.57   197
197  1010     DG  ...   23.40   198
198  1711    HUM  ...   12.77   199
199  1194   EQNR  ...  -15.82   200

[200 rows x 13 columns]
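The answer leans on pandas, but the payload is ordinary JSON with a top-level `data` list, so the standard library is enough to pull fields out. A hedged sketch against a made-up two-row payload that mimics the shape implied by the output columns above:

```python
import json

# Hypothetical sample standing in for the API response: a top-level
# "data" list of asset records (field names taken from the output above).
payload = json.loads('''
{"data": [
  {"id": 60,   "ticker": "AAPL", "peRatio": 35.17, "rank": 1},
  {"id": 2287, "ticker": "MSFT", "peRatio": 34.18, "rank": 2}
]}
''')

# Extract one column without pandas.
tickers = [row['ticker'] for row in payload['data']]
print(tickers)  # ['AAPL', 'MSFT']
```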

Of course it returns an empty list: the whole site sits behind JS, which means BeautifulSoup sees nothing at all. Check whether there is an API you can query or browse, or use selenium.

@baduker Oh, okay, thank you! Much appreciated, that worked nicely. By the way, how did you get the URL that contains the JSON file?

Network ->

I see! Thanks!