Scrapy response.follow query


I followed the instructions on this page.

The example above works for their page:

<ul class="pager">
<li class="next">
<a href="/page/2/">Next <span aria-hidden="true">&rarr;</span></a>
</li>
</ul>

or

Neither works. Please advise: after page 1, the spider should check page 2, then page 3, and so on.

With XPath this is quite simple:

next_page = response.xpath('//li[@class="page-current"]/following-sibling::li[1]/a/@href').get()

Try:

relative_url = response.xpath('//li[@class="next"]/a/@href').get()

In the Scrapy shell, this gives: "/page/2/"

Additionally, if needed, you can use urljoin to combine it with the domain, as follows:

from urllib.parse import urljoin
domain = 'http://quotes.toscrape.com'
url = urljoin(domain, relative_url)

And then use the url variable as follows:

yield response.follow(url, callback=self.parse)

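How about using extract() instead of get()?

I tried it; it didn't work. Is there a way to run a for loop over the URL, e.g. url?page=1, url?page=2, and so on?

Thanks, I tried that, but it didn't work. There is no error message; it is just ignored?

@Web_Developer please show your real code: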
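The for-loop idea from the comments can be sketched as follows, assuming the site paginates through a `page` query parameter and the number of pages is known in advance. `BASE_URL` and `page_urls` are hypothetical names, not taken from the thread.

```python
from urllib.parse import urlencode

# Hypothetical listing URL; replace with the real one.
BASE_URL = "http://example.com/list"


def page_urls(first, last):
    """Build BASE_URL?page=N for every N from first to last, inclusive."""
    return [f"{BASE_URL}?{urlencode({'page': n})}" for n in range(first, last + 1)]


for url in page_urls(1, 3):
    print(url)
```

The resulting list could be assigned to a spider's `start_urls`. This only works when the page count is known; following the "next" link is the more general approach.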

http://quotes.toscrape.com/page/1/

next_page = response.css('li.page-current a::attr(href)').get()

next_page = response.css('li.page-current li a::attr(href)').get()
