Python: scraping a site with BeautifulSoup, trying to find a div element with a specific id returns None (tags: python, web, web-scraping, beautifulsoup)

The code from the question, where the lookup for the div with id='shell' prints None:

from bs4 import BeautifulSoup
from urllib import request
import csv

# adding a correct user agent
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}
#The url to be scraped
company_page = 'https://www.goodreads.com/list/show/6.Best_Books_of_the_20th_Century?'

#opening the page
page_request = request.Request(company_page, headers=headers)
page = request.urlopen(page_request)

#parse the html using beautiful soup
html_content = BeautifulSoup(page, 'html.parser')

#Parsing some of the title elements
title = html_content.find('div',id='shell')
print(title)

Try the following to retrieve all the book titles:

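from bs4 import BeautifulSoup
import requests
import csv

# A browser-like User-Agent, the same one used in the question
headers = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'}
company_page = 'https://www.goodreads.com/list/show/6.Best_Books_of_the_20th_Century?'

# Fetch the page with requests instead of urllib
page = requests.get(company_page, headers=headers)
page.raise_for_status()  # stop early on an HTTP error status
soup = BeautifulSoup(page.content, 'html.parser')

# Each book title is an <a> element with the class "bookTitle"
titles = soup.find_all('a', class_="bookTitle")
for title in titles:
    print(title.text.strip())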

Details: note that I changed urllib to requests.
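
A minimal offline sketch of the difference between the two lookups, in case it helps: find() returns None when nothing matches, while find_all() returns a list that may be empty. The SAMPLE_HTML string is hypothetical and written only for this illustration; it is not the real Goodreads markup, apart from reusing the bookTitle class from the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; the real page is much larger.
SAMPLE_HTML = """
<div id="all_votes">
  <a class="bookTitle" href="/book/1"><span>To Kill a Mockingbird</span></a>
  <a class="bookTitle" href="/book/2"><span>1984</span></a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find() returns None when no element matches, so check before using the result.
shell = soup.find("div", id="shell")
if shell is None:
    print("no div with id='shell' in this document")

# find_all() returns a (possibly empty) list, so it is safe to iterate directly.
titles = [a.get_text(strip=True) for a in soup.find_all("a", class_="bookTitle")]
print(titles)
```

Checking for None before touching the result avoids an AttributeError later if the page layout changes or the expected element is missing from the response.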

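What information were you trying to reference with id="shell"? If your problem is solved, please mark the answer as accepted so that others can see that your question has been answered.

Output:
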
To Kill a Mockingbird
1984
Harry Potter and the Sorcerer's Stone (Harry Potter, #1)
The Great Gatsby
Animal Farm
The Hobbit, or There and Back Again
The Diary of a Young Girl
The Little Prince
Fahrenheit 451
The Catcher in the Rye
The Lion, the Witch and the Wardrobe (Chronicles of Narnia, #1)
The Grapes of Wrath
One Hundred Years of Solitude
...
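
The question's code imports csv but never uses it. If the intent was to save the scraped titles, one possible sketch follows. The titles list here is a hard-coded stand-in for the scraper's result, and the write_titles helper and the "title" column name are assumptions made for this example.

```python
import csv
import io

# Stand-in for the list produced by the scraper; values copied from the output above.
titles = ["To Kill a Mockingbird", "1984", "The Great Gatsby"]

def write_titles(rows, fileobj):
    """Write one title per row under a single 'title' header (hypothetical helper)."""
    writer = csv.writer(fileobj)
    writer.writerow(["title"])
    for row in rows:
        writer.writerow([row])

# An in-memory buffer keeps the sketch free of side effects; for a real file,
# use open("titles.csv", "w", newline="", encoding="utf-8") instead.
buffer = io.StringIO()
write_titles(titles, buffer)
print(buffer.getvalue())
```

Passing newline="" when opening a real file is what the csv module documentation recommends, since the writer emits its own line endings.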