Python 3.x: How do I scrape reviews when not all of the data I need is in text form?

Tags: python-3.x, web-scraping, beautifulsoup, python-requests

I am trying to scrape reviews for a university research project. My code prints most of the information I need, but I still need to find the rating and the user ID.

Here is some of my code:



import requests
from bs4 import BeautifulSoup

s = requests.Session()

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36',
           'Referer': ""}

url = ''  # URL of the reviews page (left blank in the post)
page = s.get(url, headers=headers)  # a single request, sent with the headers
soup = BeautifulSoup(page.content, "lxml")

# Cookies collected by the session, as a plain dict.
cj = s.cookies
cookies = requests.utils.dict_from_cookiejar(cj)

# The loop bodies were lost from the post; removing these tags is an assumption.
for tag in soup(['style', 'script', 'table', 'input']):
    tag.decompose()

important = soup.find("div", id='tn15content')




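The user ID is contained in the a href element of each review, like this:

<a href="/user/ur0511587/">

The rating is in the alt attribute of the img element:

<img width="102" height="12" alt="10/10" src="">

The code below finds every a element whose href starts with /user/ur, and every img whose alt contains /10: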


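import re
important = soup.find("div", id='tn15content')

# Each review header holds a <small> "... found the following review useful:" line.
# string= replaces the deprecated text= argument.
for small in important.find_all("small", string=re.compile("review useful:")):
    div = small.parent
    # Attribute values are quoted; current selector engines reject the unquoted form.
    user_id = div.select_one('a[href^="/user/ur"]')["href"].split("ur")[1].rstrip("/")
    rating = div.select_one('img[alt*="/10"]')  # None when the review has no rating
    print(user_id, rating["alt"] if rating else "N/A")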

('0511587', '10/10')
('0209436', '9/10')
('1318093', 'N/A')
('0556711', '10/10')
('0075285', '9/10')
('0059151', '10/10')
('4445210', '9/10')
('0813687', 'N/A')
('0033913', '10/10')
('0819028', 'N/A')
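
To try the selectors without requesting the site, here is a minimal offline sketch. The HTML string is made up for illustration and only mirrors the two snippets from the question, and the helper name extract_user_and_rating is hypothetical. It uses the built-in html.parser so that lxml is not required.

```python
import re
from bs4 import BeautifulSoup

# Made-up markup that mirrors the snippets above; the real page may differ.
SAMPLE_HTML = """
<div id="tn15content">
  <div>
    <img width="102" height="12" alt="10/10" src="">
    <a href="/user/ur0511587/">someone</a>
    <small>5 out of 6 people found the following review useful:</small>
  </div>
  <div>
    <a href="/user/ur1318093/">someone else</a>
    <small>1 out of 2 people found the following review useful:</small>
  </div>
</div>
"""


def extract_user_and_rating(html):
    """Return a list of (user_id, rating) tuples, one per review header."""
    soup = BeautifulSoup(html, "html.parser")
    content = soup.find("div", id="tn15content")
    results = []
    for small in content.find_all("small", string=re.compile("review useful:")):
        block = small.parent
        link = block.select_one('a[href^="/user/ur"]')
        # "/user/ur0511587/" -> "0511587"
        user_id = link["href"].split("ur")[1].rstrip("/")
        rating = block.select_one('img[alt*="/10"]')
        results.append((user_id, rating["alt"] if rating else "N/A"))
    return results


print(extract_user_and_rating(SAMPLE_HTML))
```

The second block has no img element, so it falls back to "N/A", the same way the unrated reviews do in the output above.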

('0511587', '10/10')
I happened to be flipping channels today and saw this was on.  Since it
been several years since I last saw it I clicked it on, but didn't mean to
stay.  As it happened, I found this film to be just as gripping now as it
was before.  My own kids started watching it, too, and enjoyed it - which
was even more satisfying for me considering the kind of current junk
used to.  No, this is not an action-packed thriller, nor are there juicy
love scenes between Abrahams and his actress girlfriend.  There is no
"colorful" language to speak of; no politically correct agenda underlying
its tale of a Cambridge Jew and Scottish Christian.This is a story about what drives people internally - what pushes them to
excel or at least to make the attempt to do so.  It is a story about
personal and societal values, loyalty, faith, desire to be accepted in
society and healthy competition without the utter selfishness that
characterizes so much of the athletic endeavors of our day.  Certainly the
characters are not alike in their motivation, but the end result is the
as far as their accomplishments.My early adolescent son (whose favorite movies are all of the Star Wars
movies and The Matrix) couldn't stop asking questions throughout the movie
he was so hooked.  It was a great educational opportunity as well as
entertainment.  If you've never seen this film or it's been a long time, I
recommend it unabashedly, regardless of the labels many have tried to give
it for being slow-paced or causing boredom.  In addition to the great
- based on real people and events - the photography and the music are
fabulous and moving.  It's no mistake that this movie has been spoofed and
otherwise stolen from in the last twenty years - it's an unforgettable
and in my opinion its bashers are those who hate Oscar winners on
or who don't like the philosophies espoused by its protagonists.
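
The code that produced the review text above is not shown. One possible way to get it, sketched here under the assumption that each review body sits in the first p element following its header div, is find_next_sibling. The sample markup and the helper name extract_reviews are hypothetical. Passing a separator to get_text keeps paragraphs split by br tags from running together, as they do in "Christian.This" above.

```python
import re
from bs4 import BeautifulSoup

# Made-up markup; the placement of <p> after the header <div> is an assumption.
SAMPLE_HTML = """
<div id="tn15content">
  <div>
    <img width="102" height="12" alt="10/10" src="">
    <a href="/user/ur0511587/">someone</a>
    <small>5 out of 6 people found the following review useful:</small>
  </div>
  <p>First paragraph.<br><br>Second paragraph.</p>
</div>
"""


def extract_reviews(html):
    """Return a list of (user_id, rating, text) tuples, one per review."""
    soup = BeautifulSoup(html, "html.parser")
    content = soup.find("div", id="tn15content")
    reviews = []
    for small in content.find_all("small", string=re.compile("review useful:")):
        block = small.parent
        user_id = block.select_one('a[href^="/user/ur"]')["href"].split("ur")[1].rstrip("/")
        rating = block.select_one('img[alt*="/10"]')
        # Assumed layout: the review body is the next <p> sibling of the header.
        body = block.find_next_sibling("p")
        # The " " separator puts a space where <br> tags split the text.
        text = body.get_text(" ", strip=True) if body else ""
        reviews.append((user_id, rating["alt"] if rating else "N/A", text))
    return reviews


for user_id, rating, text in extract_reviews(SAMPLE_HTML):
    print(user_id, rating)
    print(text)
```

If the real page nests the review text differently, only the find_next_sibling line needs to change.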

