Python: unable to send requests from a parse callback in Scrapy
I have a class that scrapes some data:
import scrapy

class SiteSpider(scrapy.Spider):
    name = "somesite"
    start_urls = ['https://www.somesite.com']

    def start_requests(self):
        parser = CommentParser()
        urls = ['https://www.somesite.com']
        for url in urls:
            yield scrapy.Request(url=url, callback=parser.scrape)
In the CommentParser class, I have:
class CommentParser():
    def scrape(self, response):
        print("from CommentParser.scrape =>", response.url)
        for i in range(5):
            yield scrapy.Request(url="https://www.somesite.com/comments/?page=%d" % i, callback=self.parse)

    def parse(self, response):
        print("from CommentParser.parse => ", response.url)
        yield dict(response_url=response.url)
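But Scrapy does not send the requests created in the CommentParser class, so I never get a response in CommentParser.parse.

You have to use OOP (inheritance) here. Note the declaration SiteSpider(CommentParser): it means that SiteSpider has access to CommentParser's methods.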
import scrapy

class CommentParser(scrapy.Spider):
    def scrape(self, response):
        print("from CommentParser.scrape =>", response.url)
        for i in range(5):
            yield scrapy.Request(url="https://www.somesite.com/comments/?page=%d" % i, callback=self.parse)

    def parse(self, response):
        print("from CommentParser.parse => ", response.url)
        yield dict(response_url=response.url)

class SiteSpider(CommentParser):
    name = "somesite"
    start_urls = ['https://www.somesite.com']

    def start_requests(self):
        urls = ['https://www.somesite.com']
        for url in urls:
            # self.scrape resolves to the scrape method inherited from CommentParser
            yield scrapy.Request(url=url, callback=self.scrape)
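Is there a particular reason you need the classes structured like this?

Yes, I want each part of the site to be parsed in its own class. For example, one class for comments, another for titles, another for the body, and so on.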