Python Beautiful Soup selector returns an empty list

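I'm working through the "Automate the Boring Stuff" course and trying to scrape Amazon's price for the book "Automate the Boring Stuff". No matter what I do, the selector returns an empty list, which leads to an IndexError at elems[0].text.strip(). I don't know what to do.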

import bs4
import requests

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # to make the server think it's a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()
    soup = bs4.BeautifulSoup(res.text, 'html.parser')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()

price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

Your request triggers a 503 error from Amazon, probably because of Amazon's anti-scraping measures. You may want to consider a different approach.

import requests

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'} # to make the server think its a web browser and not a bot

productUrl = 'https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1'

res = requests.get(productUrl, headers=headers)

print(res)

Output:

<Response [503]>
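
A quick way to tell these failure modes apart is to inspect the status code and the downloaded HTML before running any selector. The sketch below is only illustrative: `diagnose_response` is a hypothetical helper, and the marker string is an assumption based on the "not a robot" page text quoted in the comments on this question.

```python
# Hypothetical helper, not part of requests or bs4.
# BOT_CHECK_MARKER is an assumed substring of the block page; adjust it
# to whatever text the site actually returns to you.
BOT_CHECK_MARKER = "not a robot"


def diagnose_response(status_code, html):
    """Return a short label describing why a scrape may have failed."""
    if status_code == 503:
        return "blocked: server answered 503"
    if BOT_CHECK_MARKER in html.lower():
        return "blocked: bot-check page served instead of the product page"
    return "ok"


# No network access is needed to try it out:
print(diagnose_response(503, ""))
print(diagnose_response(200, "<p>We just need to make sure you're not a robot.</p>"))
print(diagnose_response(200, "<div id='mediaNoAccordion'>...</div>"))
```

With a real request you would call it as `diagnose_response(res.status_code, res.text)`. Note that `res.raise_for_status()` already raises on a 503, so the second check matters mostly when the block page arrives with a 200 status.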

You need to change the parser to lxml and use headers = {'user-agent': 'Mozilla/5.0'}:

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0'} # to make the server think its a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select_one('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems.text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

If you want to use select() instead:

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0'} # to make the server think its a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)
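
Both variants above still fail when the element is missing from the downloaded page: select() returns an empty list (hence the IndexError), and select_one() returns None (hence an AttributeError on .text). A minimal sketch of a defensive version follows. It assumes made-up sample HTML and a hypothetical helper name `extract_text`, and it uses the built-in 'html.parser' so it runs without lxml.

```python
import bs4


def extract_text(html, selector):
    """Return the stripped text of the first match, or None if nothing matches."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    elem = soup.select_one(selector)  # None when the selector matches nothing
    if elem is None:
        return None
    return elem.get_text(strip=True)


# Made-up sample pages for illustration only (not real Amazon markup).
page_with_price = '<div id="price"> $23.99 </div>'
page_without_price = '<p>Robot check</p>'

print(extract_text(page_with_price, '#price'))     # $23.99
print(extract_text(page_without_price, '#price'))  # None

# select() on the same page gives an empty list rather than None:
print(bs4.BeautifulSoup(page_without_price, 'html.parser').select('#price'))  # []
```

Returning None lets the caller decide what to do, for example printing the first few hundred characters of the HTML to see what the server actually sent.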

Try this:

def getAmazonPrice(productUrl):
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:69.0) Gecko/20100101 Firefox/69.0'}  # to make the server think its a web browser and not a bot
    res = requests.get(productUrl, headers=headers)
    res.raise_for_status()


    soup = bs4.BeautifulSoup(res.text, 'lxml')
    elems = soup.select('#mediaNoAccordion > div.a-row > div.a-column.a-span4.a-text-right.a-span-last')
    return elems[0].text.strip()


price = getAmazonPrice('https://www.amazon.com/Automate-Boring-Stuff-Python-2nd-ebook/dp/B07VSXS4NK/ref=sr_1_1?crid=30NW5VCV06ZMP&dchild=1&keywords=automate+the+boring+stuff+with+python&qid=1586810720&sprefix=automate+the+bo%2Caps%2C288&sr=8-1')
print('The price is ' + price)

Comments:

Can you give me an example of a website I can scrape? I'm stuck at this point and need something that works before I continue the course.

The downloaded HTML contains text that says: "Sorry, we just need to make sure you're not a robot. For best results, please make sure your browser is accepting cookies." That gives you a hint.

I get the following error: bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library? Then I tried installing the lxml module and importing it, and now I get: return elems.text.strip() AttributeError: 'NoneType' object has no attribute 'text'

@MahdeenSky: You need to install it with pip install lxml

I installed it, but then I got a new error: AttributeError: 'NoneType' object has no attribute 'text'

@Buddy Check my code: it is select_one(), not select()

I copy-pasted the exact code, but it doesn't work. I still get the AttributeError.
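
The FeatureNotFound error mentioned in the comments is raised when the requested parser is not installed. One possible way to handle it, sketched here with a hypothetical helper name `make_soup`, is to fall back to Python's built-in 'html.parser'. This only avoids the parser error; it does not help if the server sends a bot-check page instead of the product page.

```python
import bs4


def make_soup(html):
    """Parse with lxml when it is installed, otherwise use the built-in parser."""
    try:
        return bs4.BeautifulSoup(html, 'lxml')
    except bs4.FeatureNotFound:
        # lxml is missing; 'html.parser' ships with Python itself.
        return bs4.BeautifulSoup(html, 'html.parser')


# Made-up sample HTML for illustration only.
soup = make_soup('<p id="greeting">hi</p>')
print(soup.select_one('#greeting').text)  # hi
```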