Not sure why an XPath query returns empty

xpath scrapy


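I'm new to Scrapy and am trying out the shell, attempting to retrieve the products from the URL passed to fetch() below.

Here is my code. I'm not sure why the final query returns empty: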

$ scrapy shell

fetch("https://www.newbalance.co.nz/men/shoes/running/trail/?prefn1=sizeRefinement&prefv1=men_shoes_7")
div_product_lists = response.xpath('//div[@id="product-lists"]')
ul_product_list_main = div_product_lists.xpath('//ul[@id="product-list-main"]')
for li_tile in ul_product_list_main.xpath('//li[@class="tile"]'):
...    print li_tile.xpath('//div[@class="product"]').extract()
...
[]
[]

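If I inspect the page with the browser's element inspector, I can see data in the div (with class "product"), so I'm not sure why the query returns empty. Any help would be much appreciated.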

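The data you want to extract is more readily available in other divs, the ones with class "product-top-spacer".
For example, you can get all divs with class="product-top-spacer" like this: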
ts = response.xpath('//div[@class="product-top-spacer"]')

Then check the item name and price in the first div extracted:
ts[0].xpath('descendant::p[@class="product-name"]/a/text()').extract()[0]
>> 'Leadville v3'

ts[0].xpath('descendant::div[@class="product-pricing"]/text()').extract()[0].strip()
>> '$260.00'

To get every item, iterate over ts:

for t in ts:
    itname = t.xpath('descendant::p[@class="product-name"]/a/text()').extract()[0]
    itprice = t.xpath('descendant::div[@class="product-pricing"]/text()').extract()[0].strip()
    itprice = ' '.join(itprice.split()) # some cleaning
    print(itname + ", " + itprice)
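The "some cleaning" step in the loop above can be looked at in isolation. The following is a minimal sketch; the raw strings are hypothetical stand-ins for what text() might return from the page, not values captured from the site.

```python
def clean_text(raw):
    """Collapse every run of whitespace into one space and trim both ends."""
    return " ".join(raw.split())


# Hypothetical raw text nodes, padded with the indentation and newlines
# that usually surround text inside HTML markup.
raw_price = "\n        $260.00\n    "
raw_range = "$120.00\n   -\t$150.00"

print(clean_text(raw_price))
print(clean_text(raw_range))
```

Because split() with no arguments already discards leading and trailing whitespace, the separate strip() call becomes redundant once this step is applied.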

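The problem here is that XPath does not understand that class="product product-tile " means "this element has two classes, product and product-tile".

In an XPath selector, the class attribute is just a string, like any other attribute.
Knowing that, you can search for the entire class string: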
>>> li_tile.xpath('.//div[@class="product product-tile "]')
[<Selector xpath='.//div[@class="product product-tile "]' data='<div class="product product-tile " tabin'>]
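The same behaviour can be reproduced without Scrapy. The sketch below assumes Python's standard xml.etree.ElementTree module (whose limited XPath subset also compares attributes as plain strings) and a small hypothetical fragment of markup, not the real page:

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified markup. Note the trailing space in the
# class attribute, as in the selector output above.
SAMPLE = """
<ul id="product-list-main">
  <li class="tile">
    <div class="product product-tile ">
      <p class="product-name"><a href="#">Leadville v3</a></p>
    </div>
  </li>
</ul>
"""

root = ET.fromstring(SAMPLE)
li_tile = root.find("./li[@class='tile']")

# Comparing @class against a single class name matches nothing, because
# the attribute value is the whole string "product product-tile ".
exact_single = li_tile.findall(".//div[@class='product']")

# Comparing against the full string, trailing space included, matches.
exact_full = li_tile.findall(".//div[@class='product product-tile ']")

print(len(exact_single), len(exact_full))
```

Matching on the full string is brittle: any change in class order or whitespace on the page breaks the query.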

If you want to find all elements that have the "product" class, the easiest way is to use a CSS selector:
>>> li_tile.css('div.product')
[<Selector xpath="descendant-or-self::div[@class and contains(concat(' ', normalize-space(@class), ' '), ' product ')]" data='<div class="product product-tile " tabin'>]

Looking at the generated selector, you can see that doing this with XPath alone is a bit more complicated.
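To see what the generated XPath expression is doing, its class test can be mimicked in plain Python. This is only an illustrative sketch: has_class is a hypothetical helper, not part of Scrapy, and str.split() treats a few more characters as whitespace than XPath's normalize-space() does.

```python
def has_class(class_attr, name):
    """Approximate the XPath test
    contains(concat(' ', normalize-space(@class), ' '), ' name ')."""
    normalized = " ".join(class_attr.split())  # normalize-space(@class)
    padded = " " + normalized + " "            # concat(' ', ..., ' ')
    return (" " + name + " ") in padded        # contains(..., ' name ')


class_attr = "product product-tile "

print(has_class(class_attr, "product"))       # whole class token
print(has_class(class_attr, "product-tile"))  # whole class token
print(has_class(class_attr, "tile"))          # only a fragment of a token
```

Padding both the attribute value and the class name with spaces is what stops "product" from matching an element whose only class is "product-tile".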
Comments:

What attributes do you want to extract from the highlighted div?

In plain English, I'm trying to get the item name and price (e.g. in the example URL, "590v3 Trail" and $120.00 for the first item).

Thanks Daniel. But if I replace the last line with '//div' (instead of trying to restrict it to "product"), then I do get results, so things must be working up to that point, e.g. for li_tile in ul_product_list_main.xpath('//li[@class="tile"]'): ... print li_tile.xpath('//div') ... [, P.S. Apologies for the formatting, but I couldn't find a better way to do this in a comment.

Thank you very much for the explanation. Good to understand the reason, and thanks for the CSS selector tip too. Cheers :)