Python Scrapy neither shows any error nor fetches any data


I am trying to parse product names and prices from a website using Scrapy. However, when I run my Scrapy code, it neither shows any error nor fetches any data. Whatever I am doing wrong is beyond my ability to spot. I hope someone will take a look.

items.py contains:

import scrapy
class SephoraItem(scrapy.Item):
    Name = scrapy.Field()
    Price = scrapy.Field()
The spider file, named sephorasp.py, contains:

# scrapy.contrib.spiders is deprecated; on Scrapy >= 1.0 use scrapy.spiders
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor

class SephoraspSpider(CrawlSpider):
    name = "sephorasp"
    allowed_domains = ['sephora.ae']
    start_urls = ["https://www.sephora.ae/en/stores/"]
    rules = [
            Rule(LinkExtractor(restrict_xpaths='//li[@class="level0 nav-1 active first touch-dd  parent"]')),
            Rule(LinkExtractor(restrict_xpaths='//li[@class="level2 nav-1-1-1 active first"]'),
            callback="parse_item")
    ]

    def parse_item(self, response):
        page = response.xpath('//div[@class="product-info"]')
        for titles in page:
            Product = titles.xpath('.//a[@title]/text()').extract()
            Rate = titles.xpath('.//span[@class="price"]/text()').extract()
            yield {'Name':Product,'Price':Rate}
Here is the link to the logs:

It works when I play around with BaseSpider:

from scrapy.spider import BaseSpider
from scrapy.http.request import Request

class SephoraspSpider(BaseSpider):
    name = "sephorasp"
    allowed_domains = ['sephora.ae']
    start_urls = [
                    "https://www.sephora.ae/en/travel-size/make-up",
                    "https://www.sephora.ae/en/perfume/women-perfume",
                    "https://www.sephora.ae/en/makeup/eye/eyeshadow",
                    "https://www.sephora.ae/en/skincare/moisturizers",
                    "https://www.sephora.ae/en/gifts/palettes"

    ]

    def pro(self, response):
        item_links = response.xpath('//a[contains(@class,"level0")]/@href').extract()
        for a in item_links:
            yield Request(a, callback = self.end)

    def end(self, response):
        item_link = response.xpath('//a[@class="level2"]/@href').extract()
        for b in item_link:
            yield Request(b, callback = self.parse)

    def parse(self, response):
        page = response.xpath('//div[@class="product-info"]')
        for titles in page:
            Product = titles.xpath('.//a[@title]/text()').extract()
            Rate = titles.xpath('.//span[@class="price"]/text()').extract()
            yield {'Name':Product,'Price':Rate}

Your XPaths are badly flawed:

Rule(LinkExtractor(restrict_xpaths='//li[@class="level0 nav-1 active first touch-dd  parent"]')),
Rule(LinkExtractor(restrict_xpaths='//li[@class="level2 nav-1-1-1 active first"]'),
You are matching the entire class attribute, which can change at any time, and the class ordering seen by Scrapy can differ. Just pick one class that is most likely unique enough:

Rule(LinkExtractor(restrict_xpaths='//li[contains(@class,"level0")]')),
Rule(LinkExtractor(restrict_xpaths='//li[contains(@class,"level2")]')),

Can you post the crawl log? You can do that with the `scrapy crawl spider -s LOG_FILE=output.log` or `scrapy crawl spider &> output.log` commands.

Thank you for the reply, Granitosaurus. I have added what you asked for, but I could not upload it in a searchable format.

Dear Granitosaurus, I ran my code with your corrected XPaths, but new errors appeared in the console I uploaded; the image above is the updated one. Thanks.

Dear sir, I grabbed the log as you suggested, but I cannot upload it because it is huge. Could you tell me how?

@SMth80 use a pastebin of some sort, e.g. upload the log and share the link.

Okay sir, I will do that. By the way, if I write the code with BaseSpider it more or less works; I have updated the working version above. I would still like to parse without typing in any URLs, though.