Python Scrapy: unable to scrape the UPC number correctly

python scrapy

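I am having trouble extracting the UPC code from Best Buy product pages. The UPC sits in a different row on each product page, so a selector with a fixed row index extracts the wrong data.

Here is an example of the UPC moving between rows.

Product 1: the CSS selector is

.row:nth-child(8) .col-xs-6.v-fw-regular

Product 2: the CSS selector is

.row:nth-child(12) .col-xs-6.v-fw-regular

Here are two pages where this can be seen

Here is my code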

import scrapy
from ..item import QuotestutorialItem


class QuotesSpider(scrapy.Spider):
    name = 'bestbuy'
    start_urls = ['https://www.bestbuy.com/site/promo/newly-discounted-outlet-products']

    def parse(self, response):
        products = response.css('div.sku-title')

        for product in products:
            item = QuotestutorialItem()
            item['title'] = product.css('div.sku-title a::text').extract()

            another_page = product.css('div.sku-title a::attr(href)').get()
            if another_page:
                yield response.follow(another_page, callback=self.parse_price, meta={'item': item})
            else:
                yield item

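    def parse_price(self, product):
        # `product` is the response for the product detail page
        item = product.meta['item']
        item['price'] = product.css('.priceView-layout-large .priceView-customer-price span::text').extract()[1]
        # Fixed row index: the UPC row differs between pages, so this can pick the wrong row
        item['upc'] = product.css('.row:nth-child(10) .col-xs-6.v-fw-regular::text').extract()
        yield item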

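Have you tried a negative index, -1? Or get all the rows as a Python list and use list[-1] to take the last element of the list.

That's a good idea, let me try it.

It worked! I completely forgot to try that. Thank you very much.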
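
For reference, the list[-1] suggestion can be sketched in plain Python. This is only an illustration, assuming the UPC is always the last value in the specification rows. The helper names (`last_value`, `value_for_label`) and the sample data are hypothetical, not taken from the Best Buy pages. The second helper is an alternative that was not discussed in the thread: look the value up by its label instead of by position, which assumes the labels can be extracted alongside the values.

```python
def last_value(values):
    """Return the last non-empty string from a list of extracted texts, or None."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    return cleaned[-1] if cleaned else None


def value_for_label(labels, values, wanted="UPC"):
    """Return the value paired with the wanted label, or None if it is absent."""
    for label, value in zip(labels, values):
        if label.strip().rstrip(":").upper() == wanted.upper():
            return value.strip()
    return None


# Hypothetical lists standing in for what a selector's .getall() would return
# on two product pages with a different number of specification rows.
product_1 = ["Model X", "Black", "400012345678"]
product_2 = ["Model Y", "Silver", "2 years", "400087654321 "]

print(last_value(product_1))
print(last_value(product_2))
print(value_for_label(["Color", "UPC"], ["Black", "400012345678"]))
```

In the spider, `values` could be the list returned by something like `product.css('.row .col-xs-6.v-fw-regular::text').getall()` inside `parse_price`, that is, the original selector without the `:nth-child(...)` index. Whether the UPC really is the last row on every page would need to be checked against the live pages.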