
Python XPath: how to print multiple elements


I am trying to scrape product titles from the first page of Amazon search results using HTMLSession and XPath.

from requests_html import HTMLSession
from bs4 import BeautifulSoup

def getTitle(url):
    session = HTMLSession()
    r = session.get(url)
    r.html.render(sleep=1)

    
    product = {
        'title': r.html.xpath('//*[@class="a-size-medium a-color-base a-text-normal"]').text
    }

    print(product)
    return product


getTitle('https://www.amazon.com/s?k=amazon+echo+dot&qid=1605730376&ref=sr_pg_1')

>{'title': 'Echo Dot (3rd Gen) - Smart speaker with Alexa - Charcoal'}
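The product titles have the attribute class="a-size-medium a-color-base a-text-normal", so I expected to scrape every product title shown on that page, but the code outputs only one of them.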

For example, this is what I want:

{'title': 'Echo dot 1st gen...'}
{'title': 'Echo dot for kids...'}
{'title': 'Amazon Echo dot 3rd gen...'}
Any suggestions or solutions?

Thanks.

Why use XPath for something this simple?

[x.text for x in soup.find_all(class_="a-size-medium a-color-base a-text-normal")]
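One thing to note: a dictionary does not allow duplicate keys, so you cannot have multiple 'title' entries in one dictionary. You can, however, use numbered keys such as title0, title1: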

{'title'+str(x):y.text for x,y in enumerate(soup.find_all(class_="a-size-medium a-color-base a-text-normal"))}
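
To make the duplicate-key point concrete, here is a minimal sketch that needs no network access. The titles are made-up placeholders, not scraped data, and the variable names (`single`, `numbered`, `records`) are only illustrative.

```python
# Placeholder titles for illustration; nothing here is scraped from a real page.
titles = ["Echo Dot (1st Gen)", "Echo Dot Kids", "Echo Dot (3rd Gen)"]

# Reusing the same key: every assignment overwrites the previous value,
# so only the last title survives.
single = {}
for title in titles:
    single["title"] = title

# Numbered keys keep every title; enumerate() starts counting at 0.
numbered = {"title" + str(i): title for i, title in enumerate(titles)}

# A list of one-key dictionaries gives one {'title': ...} record per product.
records = [{"title": title} for title in titles]

print(single)
print(numbered)
for record in records:
    print(record)
```

The list of one-key dictionaries is probably the closest match to the output asked for in the question, since each record prints on its own line.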

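I made a few modifications to your function so that it collects the titles into product, a list of dictionaries (by the way, you do not really need the dictionaries). You also do not need to do this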

def getTitle(url):
    session = HTMLSession()
    r = session.get(url)
    r.html.render(sleep=1)

    product=[{'title':item.text} for item in r.html.xpath('//*[@class="a-size-medium a-color-base a-text-normal"]')]
    return product


results=getTitle('https://www.amazon.com/s?k=amazon+echo+dot&qid=1605730376&ref=sr_pg_1')
Replace the product line with the following to get a list of titles (strings) instead of dictionaries with a 'title' key and value:

product=[item.text for item in r.html.xpath('//*[@class="a-size-medium a-color-base a-text-normal"]')]
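
The same pattern can be tried offline. The sketch below swaps in the standard library's xml.etree.ElementTree for requests_html, so it is a stand-in rather than the code above: `SAMPLE` is a hypothetical, well-formed snippet and `get_titles` is an illustrative helper name. The point it shows is that an XPath query returns a list of matches, which you iterate over to read each element's text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a rendered results page.
SAMPLE = """
<html><body>
  <span class="a-size-medium a-color-base a-text-normal">Echo Dot (1st Gen)</span>
  <span class="a-size-base">Sponsored</span>
  <span class="a-size-medium a-color-base a-text-normal">Echo Dot Kids</span>
  <span class="a-size-medium a-color-base a-text-normal">Echo Dot (3rd Gen)</span>
</body></html>
"""

CLASS_VALUE = "a-size-medium a-color-base a-text-normal"


def get_titles(markup):
    """Return the text of every element whose class attribute equals CLASS_VALUE."""
    root = ET.fromstring(markup)
    # findall() returns a list of all matching elements, in document order.
    matches = root.findall(f".//*[@class='{CLASS_VALUE}']")
    return [element.text for element in matches]


titles = get_titles(SAMPLE)                       # list of strings
products = [{"title": title} for title in titles]  # list of dictionaries

for product in products:
    print(product)
```

ElementTree supports only a subset of XPath and requires well-formed markup, so real pages will usually still need an HTML-aware parser such as the one used in the answer.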

I am having a hard time wrangling the page source; it is so large that IPython keeps crashing :-(