Web scraping with BeautifulSoup in Python

python python-3.x web-scraping

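I'm trying to scrape the reviews for a movie from IMDb using Python 3.6. However, when I print my `review`, only one review shows up, and I don't know why the rest are missing. This doesn't happen with my `review_title`. Any suggestions or help would be greatly appreciated; I've been searching forums and Google with no luck.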

from urllib.request import urlopen
from bs4 import BeautifulSoup
import re

url = urlopen('http://www.imdb.com/title/tt0111161/reviews?ref_=tt_ov_rt').read()

soup = BeautifulSoup(url,"html.parser")

print(soup.prettify())

review_title = soup.find("div",attrs={"class":"lister"}).findAll("div",{"class":"title"})
review = soup.find("div",attrs={"class":"text"})
review = soup.find("div",attrs={"class":"text"}).findAll("div",{"class":"text"})
rating = soup.find("span",attrs={"class":"rating-other-user-rating"}).findAll("span")

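How would you reach all the content on that page without a loop? Your script does exactly what it was written to do: `find()` returns only the first matching element, so it parses a single review. Try the approach below. It fetches all the visible data.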

from urllib.request import urlopen
from bs4 import BeautifulSoup

url = urlopen('http://www.imdb.com/title/tt0111161/reviews?ref_=tt_ov_rt').read()
soup = BeautifulSoup(url,"html.parser")
# Each review sits in its own "review-container" block, so loop over all of them.
for item in soup.find_all(class_="review-container"):
    review_title = item.find(class_="title").text
    review = item.find(class_="text").text
    try:
        # The score is the element just before the "point-scale" element.
        rating = item.find(class_="point-scale").previous_sibling.text
    except AttributeError:
        # Some reviews have no rating, so find() returns None here.
        rating = ""
    print("Title: {}\nReview: {}\nRating: {}\n".format(review_title,review,rating))
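To see why the original script printed a single review, here is a minimal sketch that runs offline. The `SAMPLE_HTML` string is hypothetical markup written for illustration; it only imitates the class names used above and is not the real IMDb page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the live page (an assumption).
SAMPLE_HTML = """
<div class="lister">
  <div class="review-container">
    <div class="title">First</div><div class="text">Great film.</div>
  </div>
  <div class="review-container">
    <div class="title">Second</div><div class="text">A classic.</div>
  </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() stops at the first match and returns one Tag (or None).
first_only = soup.find("div", class_="text")

# find_all() returns every match in document order.
all_reviews = soup.find_all("div", class_="text")

# Chaining find().find_all() searches only INSIDE the first match,
# which holds no nested "text" divs, so the result is empty.
nested = soup.find("div", class_="text").find_all("div", class_="text")

print(first_only.text)
print([tag.text for tag in all_reviews])
print(len(nested))
```

The `review_title` line in the question worked because it started from the outer `lister` div, which contains every title, whereas the `review` lines started from the first `text` div itself.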

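@拉克里·莫萨, did you try this script? What's the feedback?

It works well. Thanks! I just switched the print to storing the results in a DataFrame.

If it worked, please make sure to click the grey check mark next to my answer.

I did, but it says I need 15 reputation before it shows publicly.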
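The asker's exact DataFrame code is not shown in the thread. The following is one possible sketch, assuming pandas is installed. The helper name `reviews_to_frame` and the `SAMPLE_HTML` markup are hypothetical, and `find_previous_sibling("span")` is swapped in for `previous_sibling` so that whitespace between tags is skipped.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical markup imitating the class names used in the answer (an assumption).
SAMPLE_HTML = """
<div class="review-container">
  <span class="rating-other-user-rating">
    <span>10</span><span class="point-scale">/10</span>
  </span>
  <div class="title">First</div>
  <div class="text">Great film.</div>
</div>
<div class="review-container">
  <div class="title">Second</div>
  <div class="text">A classic.</div>
</div>
"""

def reviews_to_frame(html):
    """Collect one row per review and return them as a DataFrame."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for item in soup.find_all(class_="review-container"):
        scale = item.find(class_="point-scale")
        # Unrated reviews have no point-scale element, so keep an empty rating.
        rating_tag = scale.find_previous_sibling("span") if scale else None
        rows.append({
            "title": item.find(class_="title").get_text(strip=True),
            "review": item.find(class_="text").get_text(strip=True),
            "rating": rating_tag.get_text(strip=True) if rating_tag else "",
        })
    return pd.DataFrame(rows, columns=["title", "review", "rating"])

df = reviews_to_frame(SAMPLE_HTML)
print(df)
```

With the live page, the same function could be given the bytes returned by `urlopen(...).read()` in place of `SAMPLE_HTML`, provided the page still uses these class names.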