Python Scrapy pagination - spider won't stop


Can someone help me understand what mistake I made in the pagination of this code?

When I try a generic XPath:

if len(response.xpath("//*")) == 0:
    raise CloseSpider('No more products to scrape...')
I get all the data, but the spider never stops.
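That behaviour is expected: even a "no more results" page is still valid markup, so `//*` (which matches every element node) is never empty. A minimal stand-alone demonstration using only the standard library (the empty page below is a hypothetical stand-in for the site's real response):

```python
import xml.etree.ElementTree as ET

# Hypothetical minimal "no more products" page: it contains no product
# markup, but it still has <head>, <title>, and <body> elements.
empty_page = "<html><head><title>busca</title></head><body></body></html>"

root = ET.fromstring(empty_page)

# ".//*" matches every descendant element, so it is non-empty on any
# real page -- which is why `len(response.xpath("//*")) == 0` never fires.
elements = root.findall(".//*")
print(len(elements))  # 3 (head, title, body)
```

The stop condition therefore has to test for something product-specific, not for the mere presence of elements.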

When I try this XPath:

if len(response.xpath("//h3[@class='shelf-product-name ']/a/@href")) == 0:
    raise CloseSpider('No more products to scrape...')
The pages range from 0 to 50, so in theory it should return 2,550 items (50 items per page).

But when I use the second XPath, it stops at some point, and I don't know why.
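One thing worth checking: the class name in the XPath has a trailing space (`'shelf-product-name '`), and `@class='…'` is an exact string comparison, so any product whose class attribute lacks that exact trailing space will not match. A small plain-Python sketch of the difference (the `classes_seen` variants are hypothetical, not taken from the real site):

```python
# Exact comparison, as in @class='shelf-product-name ':
classes_seen = ["shelf-product-name ", "shelf-product-name"]  # hypothetical variants
target = "shelf-product-name "
print([c == target for c in classes_seen])                 # [True, False]

# A contains()-style check tolerates the inconsistent trailing space:
print(["shelf-product-name" in c for c in classes_seen])   # [True, True]
```

In XPath terms this corresponds to using `//h3[contains(@class, 'shelf-product-name')]` instead of the exact `@class=` match.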

import scrapy
from scrapy.exceptions import CloseSpider
 
 
class ProdutosSpider(scrapy.Spider):
    name = 'produtos_aplus'
    allowed_domains = ['www.allpartsnet.com.br']
    start_urls = ["https://www.allpartsnet.com.br/buscapagina?fq=B%3a1228&O=OrderByNameASC&PS=50&sl=5d58b484-137e-4091-92ca-29d2e0c70f85&cc=1&sm=0&PageNumber=0"]
    page = 0
 
    def parse(self, response):
 
        if len(response.xpath("//h3[@class='shelf-product-name ']/a/@href")) == 0:
            raise CloseSpider('No more products to scrape...')
 
        for produtos in response.xpath("//div[@class='QD prateleira row qd-xs n1colunas']/ul"):
            link = produtos.xpath(".//h3[@class='shelf-product-name ']/a/@href").get()
            cod_all = produtos.xpath(".//span[@class='insert-sku-name']/text()").get()
            yield response.follow(url=link, callback=self.parse_produto, meta={'link': link, 'cod_all': cod_all})

        self.page += 1
        yield scrapy.Request(
            url=f'https://www.allpartsnet.com.br/buscapagina?fq=B%3a1228&O=OrderByNameASC&PS=50&sl=5d58b484-137e-4091-92ca-29d2e0c70f85&cc=1&sm=0&PageNumber={self.page}',
            callback=self.parse
        )

    def parse_produto(self, response):
        link = response.request.meta['link']
        cod_all = response.request.meta['cod_all']        
        for produtos in response.xpath("//div[@class='vehicle-selection']/div[@id='caracteristicas']"):
            yield {
                'link': link,
                'cod_all': cod_all,
                'fabricante': produtos.xpath(".//td[@class='value-field Fabricante']/text()").get(),
                'ean': produtos.xpath(".//td[@class='value-field Codigo-EAN']/text()").get(),
                'oem': produtos.xpath(".//td[@class='value-field Codigo-OEM']/text()").get()
            }
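For what it's worth, a common pattern is to schedule the next page only when the current page actually returned products, rather than raising `CloseSpider` from inside the callback. A minimal stand-alone sketch of that control flow, with no Scrapy dependency (`FAKE_SITE` and `fetch` are hypothetical stand-ins for the real site and requests):

```python
# Hypothetical stand-in for the site: pages 0-2 return products,
# every later page is empty.
FAKE_SITE = {0: ["p1", "p2"], 1: ["p3"], 2: ["p4"]}

def fetch(page):
    """Stand-in for requesting a listing page; returns its products."""
    return FAKE_SITE.get(page, [])

def crawl():
    collected, page = [], 0
    while True:
        products = fetch(page)
        if not products:          # empty page => genuinely done
            break                 # stop paginating instead of raising
        collected.extend(products)
        page += 1                 # only advance when the page had items
    return collected

print(crawl())  # ['p1', 'p2', 'p3', 'p4']
```

In the spider this would mean yielding the `PageNumber={self.page + 1}` request inside an `if` guarded by the product count, so the crawl ends naturally when an empty page arrives.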