Scrapy: What am I doing wrong? I want my spider to crawl to the next page using the URL


I'm new to Scrapy and working through the simple tutorial. Everything works except that I can't crawl to the next page.

import scrapy


class QuotesSpider(scrapy.Spider):
    name = "quotes"
    start_urls = ['http://quotes.toscrape.com']
    allowed_domains = ["quotes.toscrape.com"]

    def parse(self, response):
        for response in response.xpath('//div[@class="quote"]'):
            yield {
                "quote": response.xpath('./span[@class="text"]/text()').extract(),
                "author": response.xpath('./span/small[@class="author"]/text()').extract(),
                "tag": response.xpath('./div[@class="tags"]/a/text()').extract()
            }
            next_page = response.xpath('//nav/ul[@class="pager"]/li[@class="next"]/a/@href').extract_first()
            if next_page is not None:
                next_page_url = response.urljoin(next_page)
                yield scrapy.Request(url=next_page_url, callback=self.parse)

My error message:

next_page_url = response.urljoin(next_page)

AttributeError: 'Selector' object has no attribute 'urljoin'

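The problem is that your for loop overwrites the response object. Inside the loop, the name response no longer refers to the page response but to a Selector (as the AttributeError says), and a Selector has no urljoin method. Giving the loop variable its own name should solve your problem: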

for response_path in response.xpath('//div[@class="quote"]'):
    # response_path is the per-quote Selector; response stays the page response
    yield {
        "quote": response_path.xpath('./span[@class="text"]/text()').extract(),
        "author": response_path.xpath('./span/small[@class="author"]/text()').extract(),
        "tag": response_path.xpath('./div[@class="tags"]/a/text()').extract(),
    }
    next_page = response_path.xpath('//nav/ul[@class="pager"]/li[@class="next"]/a/@href').extract_first()
    if next_page is not None:
        next_page_url = response.urljoin(next_page)
        yield scrapy.Request(url=next_page_url, callback=self.parse)
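
To see why reusing the name matters, here is a minimal sketch that runs without Scrapy. FakeResponse and FakeSelector are hypothetical stand-ins, not Scrapy classes; they only mimic the one relevant difference, namely that the page response has urljoin and the per-quote selector does not.

```python
from urllib.parse import urljoin


class FakeSelector:
    """Hypothetical stand-in for a per-quote selector: it has no urljoin()."""

    def __init__(self, text):
        self.text = text


class FakeResponse:
    """Hypothetical stand-in for a page response: it has a URL and urljoin()."""

    def __init__(self, url, quotes):
        self.url = url
        self.quotes = quotes

    def xpath(self, query):
        # The query is ignored here; one selector is returned per quote.
        return [FakeSelector(q) for q in self.quotes]

    def urljoin(self, href):
        return urljoin(self.url, href)


def shadowed(response):
    # Same mistake as in the question: the loop variable reuses the name "response".
    for response in response.xpath("//div"):
        pass
    # "response" is now the last FakeSelector, so this raises AttributeError.
    return response.urljoin("/page/2/")


def renamed(response):
    # A distinct loop variable leaves "response" bound to the page response.
    for quote in response.xpath("//div"):
        pass
    return response.urljoin("/page/2/")


page = FakeResponse("http://quotes.toscrape.com/", ["a", "b"])
print(renamed(page))
try:
    shadowed(page)
except AttributeError as exc:
    print("shadowed:", exc)
```

The renamed version prints the joined URL, while the shadowed version fails with the same kind of AttributeError as in the question.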

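If I understand correctly, my problem was that my next-page logic was inside the for loop? Thanks.

Please mark this as the answer. Moving the lines from next_page = ... onward out of the for loop works as well, but it gives different results depending on your response content and what you want to do.

But my code doesn't overwrite the response object at all...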
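
The remark about different results can be illustrated without Scrapy. In this sketch a page is a plain dict and a follow-up request is a ("request", url) tuple; both are assumptions made for illustration, as are the function names parse_inside and parse_outside.

```python
def parse_inside(page):
    # Pagination inside the loop: one follow-up per quote,
    # and none at all if the page has no quotes.
    for quote in page["quotes"]:
        yield ("item", quote)
        if page["next"] is not None:
            yield ("request", page["next"])


def parse_outside(page):
    # Pagination after the loop: at most one follow-up per page,
    # regardless of how many quotes were found.
    for quote in page["quotes"]:
        yield ("item", quote)
    if page["next"] is not None:
        yield ("request", page["next"])


def count_requests(results):
    # Count how many follow-up requests a parse function produced.
    return sum(1 for kind, _ in results if kind == "request")


full_page = {"quotes": ["q1", "q2", "q3"], "next": "/page/2/"}
empty_page = {"quotes": [], "next": "/page/2/"}

print(count_requests(parse_inside(full_page)), count_requests(parse_outside(full_page)))
print(count_requests(parse_inside(empty_page)), count_requests(parse_outside(empty_page)))
```

With three quotes the inside-loop variant produces three follow-ups and the outside-loop variant one; with no quotes the inside-loop variant produces none. In Scrapy itself, repeated requests for the same URL are normally dropped by the default duplicate filter, so in practice the visible difference is mostly the case where a page yields no quotes.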