Python web scraping: unable to find all tags in a web page

python web-scraping

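I am trying to scrape a particular web page, but I cannot find all of the paragraph tags in it.

I have gone through the question below, but it does not seem to solve the problem.

This is a dynamic page that keeps refreshing, and if I click the "Load more commentary" button at the bottom of the page, it loads additional content.

Code: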

from bs4 import BeautifulSoup
import requests

r = requests.get("http://www.cricbuzz.com/live-cricket-scores/18127")
data = r.text

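# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(data, "html.parser")
p = soup.find_all('p')

print(len(p))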

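10

Boult to Hardik Pandya, FOUR, that could well be the misfield that costs KKR the match. It should not have been more than a single. A low full toss that Hardik cannot get any elevation on. He hits it to long-on, where Surya gets to the ball well but misfields, and it slips through.

Is there any way I can scrape all of the commentary data from this page?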

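I think what you are getting is paragraph p[9] (a single p tag). You need to put the print statement inside a loop to print all of the paragraphs. Something like this: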

body = soup.body
for p in body.find_all('p'):
    print(p.text)

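To get all of the commentary, you can use the site's API:

http://push.cricbuzz.com/match-api/18127/commentary-full.json

It returns all of the data in JSON format, which you can easily parse to extract what you need: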

import requests

r = requests.get('http://push.cricbuzz.com/match-api/18127/commentary-full.json').json() 

all_comments = r['comm_lines']

# print first 10 comments
for comment in all_comments[:10]:
    if 'comm' in comment:
        print(comment['comm'])
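
The same extraction can be wrapped in a small helper so it can be tried without hitting the network. This is only a sketch: the function names `extract_comments` and `fetch_commentary`, the `limit` parameter, and the sample payload (including its `evt` key) are made up for illustration. The only things taken from the answer above are the `comm_lines` list and the optional `comm` key. Note that, unlike the loop above, the helper filters first and then applies the limit, so `limit=10` yields up to ten comments rather than the comments found in the first ten entries.

```python
import json
from urllib.request import urlopen


def extract_comments(payload, limit=None):
    """Return the text of every entry in payload['comm_lines'] that has a 'comm' key."""
    lines = payload.get("comm_lines", [])
    texts = [line["comm"] for line in lines if "comm" in line]
    return texts if limit is None else texts[:limit]


def fetch_commentary(url, timeout=10):
    """Download a URL and decode its body as JSON (not called here; needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Hypothetical payload, shaped like the structure the answer relies on.
sample = {
    "comm_lines": [
        {"comm": "First ball"},
        {"evt": "over-break"},  # entry without a 'comm' key is skipped
        {"comm": "Second ball"},
    ]
}

print(extract_comments(sample))
```

Against the live endpoint this would be used as `extract_comments(fetch_commentary(url), limit=10)`, assuming the response still has the shape described in the answer.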

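requests.get() does not retrieve dynamic content. You should use another scraping method, such as Selenium.

I know that I need a loop if I want to print all of the paragraphs. But find_all itself returned only 10 tags, while the page contains many more.

@AdarshHV I have modified the code to output only the text inside the p tags. As for why there are only 10: those are the only active p tags on the page.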
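
To illustrate why a plain HTTP fetch sees fewer p tags than the browser shows, here is a small offline sketch. It uses the standard library's `html.parser` instead of BeautifulSoup so it runs without extra packages, and the HTML string is invented for the demonstration (it is not the Cricbuzz markup). The page contains two paragraphs plus a script that would add a third one in a browser after a click. A parser working on the served HTML never runs that script, so it only counts the two paragraphs that are already in the markup.

```python
from html.parser import HTMLParser


class ParagraphCounter(HTMLParser):
    """Counts <p> start tags found in the raw HTML text."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.count += 1


# Invented page: two static paragraphs and a script that would add one more in a browser.
STATIC_HTML = """
<html><body>
<p>Ball 1</p>
<p>Ball 2</p>
<button id="load-more">Load more commentary</button>
<script>
  // In a browser, clicking the button would append a third paragraph.
  document.getElementById("load-more").onclick = function () {
    var p = document.createElement("p");
    p.textContent = "Ball 3";
    document.body.appendChild(p);
  };
</script>
</body></html>
"""


def count_paragraphs(html):
    """Parse the HTML text as-is (no JavaScript is executed) and count <p> tags."""
    parser = ParagraphCounter()
    parser.feed(html)
    parser.close()
    return parser.count


print(count_paragraphs(STATIC_HTML))  # prints 2
```

Getting the third paragraph requires either a tool that executes the page's JavaScript, such as Selenium, or a direct call to the data source the script uses, such as the JSON endpoint mentioned above.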
