Python: Why is my custom callback not called when I yield a Request, while the parse method is?
Tags: python, web-crawler, scrapy

I want to navigate through the pages of a site, so I wrote the spider below.

pageNavi.py:
#! /usr/bin/env python
# -*- coding: utf-8 -*-

from scrapy.spider import Spider
from scrapy.selector import Selector
from scrapy.http import Request


class pageNaviSpider(Spider):
    name = 'navi'
    start_urls = ['https://itunes.apple.com/us/genre/ios-books/id6018?mt=8&letter=A&page=1#page']

    def parse(self, response):
        print 'response from: ', response.url
        self.parseLink(response)

    def parseLink(self, response):
        print 'response from: ', response.url
        sel = Selector(response)
        for url in sel.xpath("//a[@class='paginate-more']/@href").extract():
            yield Request(url, callback=self.parseLink)
The code above does not work. However, the other spider I wrote, shown below, runs fine, and I do not understand why. Do you have any suggestions?

pageNavi2.py:
#! /usr/bin/env python
# -*- coding: utf-8 -*-

from scrapy.spider import Spider
from scrapy.selector import Selector
from scrapy.http import Request


class pageNaviSpider(Spider):
    name = 'navi2'
    start_urls = ['https://itunes.apple.com/us/genre/ios-books/id6018?mt=8&letter=A&page=1#page']

    def parse(self, response):
        print 'response from: ', response.url
        sel = Selector(response)
        for url in sel.xpath("//a[@class='paginate-more']/@href").extract():
            yield Request(url, callback=self.parseLink)
You should change this:

    def parse(self, response):
        print 'response from: ', response.url
        self.parseLink(response)

to this:

    def parse(self, response):
        print 'response from: ', response.url
        for item in self.parseLink(response):
            yield item
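To see why the loop matters, here is a minimal sketch in plain Python 3 with no Scrapy involved. The names `parse_link`, `parse_broken`, `parse_fixed`, and the `calls` list are hypothetical stand-ins for the spider methods, and the "requests" are plain strings. It illustrates that calling a generator function without iterating over it never runs its body.

```python
calls = []  # records every time the body of parse_link actually starts running


def parse_link(pages):
    """Stand-in for parseLink: a generator yielding one fake request per page."""
    calls.append("parse_link started")
    for page in pages:
        yield "Request(%s)" % page


def parse_broken(pages):
    # Creates a generator object and throws it away; nothing is iterated,
    # so the body of parse_link never executes.
    parse_link(pages)
    # No return/yield statement, so this function returns None.


def parse_fixed(pages):
    # Iterating drives the generator and hands each result to the caller.
    for item in parse_link(pages):
        yield item


pages = ["page2", "page3"]

broken_result = parse_broken(pages)
calls_after_broken = list(calls)          # still empty at this point

fixed_result = list(parse_fixed(pages))   # consuming the generator runs the body

print(broken_result)       # None
print(calls_after_broken)  # []
print(fixed_result)        # ['Request(page2)', 'Request(page3)']
```

In the spider, Scrapy plays the role of `list(...)` here: it iterates over whatever the callback returns, so the callback has to hand back something iterable.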
Without a return or yield statement, a function returns None. Calling self.parseLink(response) only creates a generator object; because parse neither iterates over it nor returns it, the body of parseLink never runs and Scrapy receives no requests from parse.

Oh, right: the parse method has to return an iterable of items or requests. parse could also simply return self.parseLink(response). Thanks!
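The alternative of returning self.parseLink(response) directly can be sketched the same way in plain Python 3. The function names below are hypothetical and the "requests" are just strings. Returning the generator only works when the function contains no yield of its own; on Python 3.3 and later, `yield from` is another way to delegate.

```python
def parse_link(pages):
    """Stand-in for parseLink: yields one fake request per page."""
    for page in pages:
        yield "Request(%s)" % page


def parse_with_return(pages):
    # An ordinary function: it passes the generator object back unchanged,
    # and the caller iterates over it.
    return parse_link(pages)


def parse_with_yield_from(pages):
    # A generator function that delegates to parse_link (Python 3.3+).
    yield from parse_link(pages)


pages = ["page2", "page3"]
returned = list(parse_with_return(pages))
delegated = list(parse_with_yield_from(pages))

print(returned)               # ['Request(page2)', 'Request(page3)']
print(returned == delegated)  # True
```

Both variants hand the caller the same sequence as the explicit for loop in the answer; which one to use is mostly a matter of whether parse needs to yield anything of its own.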