Lambda callback in Scrapy start_requests always gets the last loop value

Can anyone tell me why the value of the index variable in parse() is always 10013?

import scrapy

from ..items import DynamicdesktopItem  # item class from the project's items module (import path assumed)

class GetsourcesSpider(scrapy.Spider):
    name = 'getSources'
    allowed_domains = ['bizhi.feihuo.com']
    base_url = 'http://bizhi.feihuo.com/wallpaper/share?rsid={index}/'

    def start_requests(self):
        for index in range(10010, 10014):  # 11886
            # the lambda wraps parse() so the current index can be passed along
            yield scrapy.Request(url=self.base_url.format(index=index),
                                 callback=lambda response: self.parse(response, index))

    def parse(self, response, index):
        video_label = response.xpath('//video')[0]
        item = DynamicdesktopItem()
        item['index'] = index  # response.url[-6:-1]
        item['video'] = video_label.attrib['src']
        item['image'] = video_label.attrib['poster']
        yield item

That happens because the lambda captures a reference to the index variable, not its value, which is why every callback ends up seeing the last value. Pass the value through the request's meta dict instead. See the updated code below:

import scrapy

from ..items import DynamicdesktopItem  # item class from the project's items module (import path assumed)

class GetsourcesSpider(scrapy.Spider):
    name = 'getSources'
    allowed_domains = ['bizhi.feihuo.com']
    base_url = 'http://bizhi.feihuo.com/wallpaper/share?rsid={index}/'

    def start_requests(self):
        for index in range(10010, 10014):  # 11886
            # meta is evaluated here, so each request carries its own index value
            yield scrapy.Request(url=self.base_url.format(index=index),
                                 callback=self.parse,
                                 meta={'index': index})

    def parse(self, response):
        index = response.meta['index']
        video_label = response.xpath('//video')[0]
        item = DynamicdesktopItem()
        item['index'] = index  # response.url[-6:-1]
        item['video'] = video_label.attrib['src']
        item['image'] = video_label.attrib['poster']
        yield item
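
If you are on Scrapy 1.7 or newer, cb_kwargs gives the same result while keeping the original parse(self, response, index) signature. A minimal sketch along the same lines, reusing the same class and item as above:

import scrapy

from ..items import DynamicdesktopItem  # item class from the project's items module (import path assumed)

class GetsourcesSpider(scrapy.Spider):
    name = 'getSources'
    allowed_domains = ['bizhi.feihuo.com']
    base_url = 'http://bizhi.feihuo.com/wallpaper/share?rsid={index}/'

    def start_requests(self):
        for index in range(10010, 10014):
            # cb_kwargs is evaluated per request, so each callback gets its own index
            yield scrapy.Request(url=self.base_url.format(index=index),
                                 callback=self.parse,
                                 cb_kwargs={'index': index})

    def parse(self, response, index):
        # index arrives as an ordinary keyword argument
        video_label = response.xpath('//video')[0]
        item = DynamicdesktopItem()
        item['index'] = index
        item['video'] = video_label.attrib['src']
        item['image'] = video_label.attrib['poster']
        yield item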

Because the index variable referenced by all the lambdas is not copied into their local scope; it is rebound on the next iteration of the loop. Consider this snippet:

lambdas = []
for i in range(3):
    lambdas.append(lambda: print(i))  # every lambda closes over the same i
for fn in lambdas:
    fn()
This prints 2 three times, i.e. the last value of i.
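
If you do want to keep a lambda, the usual workaround is to bind the current value as a default argument, so it is evaluated when the lambda is defined rather than when it is called. A minimal sketch:

lambdas = []
for i in range(3):
    lambdas.append(lambda i=i: print(i))  # i=i snapshots the current value
for fn in lambdas:
    fn()  # prints 0, 1, 2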

Instead of using a lambda callback, you should use the meta= keyword of the Request class.
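
A sketch of the relevant spider methods, mirroring the updated code in the answer above (same class, base_url and item as before):

    def start_requests(self):
        for index in range(10010, 10014):
            # the dict passed as meta travels with the request
            yield scrapy.Request(url=self.base_url.format(index=index),
                                 callback=self.parse,
                                 meta={'index': index})

    def parse(self, response):
        index = response.meta['index']  # read the per-request value back
        # ... build and yield the item exactly as in the code above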