Python: Beautiful Soup selector returns an empty list

python html python-3.x

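I'm working through the Automate the Boring Stuff course and trying to scrape the Amazon price of the book Automate the Boring Stuff. No matter what I do, the selector returns an empty list, so elems[0].text.strip() raises an IndexError. I don't know what to do.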

import bs4
import requests

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # to make the server think it's a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()

price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

Your request triggers a 503 error from Amazon, probably because of Amazon's anti-scraping measures. You may want to consider a different approach.

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # to make the server think it's a web browser and not a bot

productUrl = 'https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1'

res = requests.get(productUrl, headers=headers)

print(res)

Output:

<Response [503]>
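A quick way to tell a blocked request from a genuine product page is to look at both the status code and the page text. The sketch below is only an illustration: the helper name looks_blocked is hypothetical, and the marker phrase is an assumption based on the robot-check message quoted in the comment thread on this question.

```python
def looks_blocked(status_code, html):
    """Return True if the response looks like a block page, not a product page."""
    # 503 is the status reported in the output above.
    if status_code == 503:
        return True
    # Assumed marker phrase, taken from the robot-check text quoted in the comments.
    return "not a robot" in html.lower()
```

With requests, this could be called as looks_blocked(res.status_code, res.text) before any parsing, so that a block page is reported as such instead of surfacing later as an empty selector result.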

You need to change the parser to lxml (install it with pip install lxml if it is missing) and use headers = {'user-agent': 'Mozilla/5.0'}:

import bs4
import requests

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0'}  # to make the server think it's a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select_one('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems.text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)
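Switching to lxml only works when the lxml package is installed; otherwise Beautiful Soup raises bs4.FeatureNotFound, as reported in the comments on this question. A minimal sketch of a fallback, assuming that html.parser is an acceptable substitute for this page (the helper name make_soup is hypothetical):

```python
import bs4

def make_soup(html):
    """Parse html with lxml if it is installed, else with the built-in parser."""
    try:
        return bs4.BeautifulSoup(html, 'lxml')
    except bs4.FeatureNotFound:
        # lxml is not installed; fall back to Python's bundled parser.
        return bs4.BeautifulSoup(html, 'html.parser')
```

The two parsers can build slightly different trees for malformed HTML, so a selector that matches under one may not match under the other.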

If you want to use select(), then:

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0'}  # to make the server think it's a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)
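The two calls fail differently when nothing matches: select() returns an empty list, so elems[0] raises IndexError, while select_one() returns None, so elems.text raises AttributeError. The sketch below shows a guard for the select_one() case on a small made-up HTML string; the helper name extract_text, the sample markup, and the #price selector are all illustrative assumptions, not Amazon's real page structure.

```python
import bs4

def extract_text(html, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    elem = soup.select_one(selector)
    if elem is None:
        # No match: the page layout differs or a block page was served.
        return None
    return elem.text.strip()

page = '<div id="price"> $9.99 </div>'  # made-up sample markup
print(extract_text(page, '#price'))             # prints $9.99
print(extract_text('<html></html>', '#price'))  # prints None
```

Returning None lets the caller decide how to report a missing element, rather than crashing inside the scraping function.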

Try this:

import bs4
import requests

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # to make the server think it's a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

Can you give me an example of a website I can scrape? I'm stuck at this point and need something that works before I continue the course.

The downloaded HTML contains text that says: "Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies." That should give you a hint.

I get the following error: bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library? Then I tried installing the lxml module and importing it, and now I get: return elems.text.strip() AttributeError: 'NoneType' object has no attribute 'text'

@MahdeenSky: You need to install it with pip install lxml.

But then I get a new error: AttributeError: 'NoneType' object has no attribute 'text'

@Buddy check my code: it is select_one(), not select().

I copy-pasted the exact code but it doesn't work. I still get the AttributeError.