How do I scrape multiple URLs with the same parse callback in Scrapy?

web-scraping scrapy


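Hello, I have a question about my spider script. I want to keep the script as readable as possible and reuse as much code as I can. Is it possible to use the same parse method for different URLs?

I only want to scrape 10 items per page and save them in different item classes in items.py.

Here is my code: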
def start_requests(self):  #I have 3 URL's Here
    yield scrapy.Request('https://teslamotorsclub.com/tmc/post-ratings/6/posts', self.parse) #Url 1
    yield scrapy.Request('https://teslamotorsclub.com/tmc/post-ratings/7/posts', self.parse) #Url 2
    yield scrapy.Request('https://teslamotorsclub.com/tmc/post-ratings/1/posts', self.parse) #Url 3


def parse(self, response): #My logic is something like this    
    if Url == Url1:
        item = TmcnfSpiderItem()
    elif Url == Url2:
        item = TmcnfSpiderItem2()
    elif Url == Url3:
        item = TmcnfSpiderItem3()

    if count <= 9:
        count += 1
        info = response.css("[id^='fc-post-" + postno_only +"']")
        author = info.xpath("@data-author").extract_first()
        item['author'] = author      
        yield item
    else:
        # Move to next URL and perform same parse
        pass

I think you can try passing all the data from start_requests, like this:
def start_requests(self):
    urls = (
        ('https://teslamotorsclub.com/tmc/post-ratings/6/posts', TmcnfSpiderItem),
        ('https://teslamotorsclub.com/tmc/post-ratings/7/posts', TmcnfSpiderItem2),
        ('https://teslamotorsclub.com/tmc/post-ratings/1/posts', TmcnfSpiderItem3),
    )
    for url, itemclass in urls:
        yield scrapy.Request(url, meta={'itemclass': itemclass}) 

def parse(self, response):
    item = response.meta['itemclass']()

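This way you pass the item class for each URL and create a new instance of that class in the parse function. Because no callback is given, Scrapy falls back to the spider's parse method for every request.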