Python: writing a spider with Scrapy, but why does 'yield item' not work inside a nested for loop?
I have a spider written in Scrapy, but the yield item statement inside the for loop is never executed. Please see the code below:
def parse_paragraph(self, div_list, category_name, group_name):
    for div in div_list:
        duilian_text_list = div.xpath('./text()').extract()
        duilian_text_list = strip_list(duilian_text_list)
        if len(duilian_text_list) == 0:
            continue
        elif len(duilian_text_list) == 1:
            duilian_text = duilian_text_list[0]
            self.parse_duilian(duilian_text, category_name, group_name)
        elif len(duilian_text_list) == 2 and not is_single_line(duilian_text_list[0]):
            duilian_text = ''.join(duilian_text_list)
            self.parse_duilian(duilian_text, category_name, group_name)
        else:
            for duilian_text in duilian_text_list:
                duilian_item = DuilianItem()
                duilian_item['id'] = str(uuid.uuid4()).replace('-', '')
                duilian_item['category_id'] = getCategoryName(category_name)
                duilian_item['group_name'] = group_name
                duilian = parse_duilian(duilian_text)
                if duilian != '|':
                    duilian_item['name'] = duilian
                    duilian_item['desc'] = ''
                    duilian_item['author'] = ''
                    duilian_item['shuti'] = ''
                    duilian_item['word_count'] = len(duilian_item['name']) // 2
                    duilian_item['image_url'] = ''
                    print('-------I am here--------')
                    yield duilian_item
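When I call this function, I get nothing in the output window. It seems that the line yield duilian_item does not work, and it even prevents other code from executing (the print line above it).
When I comment out the last line, yield duilian_item, everything works and I get -------I am here-------- in the output window. What is going wrong here?
Put simply, the code below prints nothing, but if I comment out yield 1, it prints the list. So does yield in Python not work inside a for loop?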
def strange_yield():
    list = [1, 2, 3]
    for i in list:
        print(i)
        yield 1

strange_yield()
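When you use yield in a Python function, that function becomes a generator function. Calling it does not run the body; it only returns a generator object, and the body runs as that object is iterated. The correct way to handle this is: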
my_yield = strange_yield()
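my_yield is now a generator object created by the generator function strange_yield. A generator can be iterated over, or its next value can be retrieved with the next() function: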
print(next(my_yield))
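or

for yield_value in my_yield:
    print(yield_value)

So what can I do with my spider? In Scrapy, the last parse method needs to yield an item; I don't just want to print it. Sometimes I want to debug it, but the debugger cannot step into the code.
parse_paragraph is not the problem; the function that calls it is. Can you post it? Also, see
@Gallaecio You are right. I posted another question here and am waiting for your help. For anyone with the same problem, see [here].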
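
For anyone landing here with the same symptom, below is a minimal sketch of what the comments point at. It assumes the calling callback invokes parse_paragraph without iterating the result; the caller is not shown in the question, so this is a guess at the cause, not a diagnosis. The names parse_broken and parse_fixed and the plain dict items are hypothetical stand-ins, and the sketch runs without Scrapy installed.

```python
def parse_paragraph(texts):
    """Generator standing in for the question's parse_paragraph (simplified)."""
    for text in texts:
        print('-------I am here--------')
        yield {'name': text}


def parse_broken(texts):
    # Hypothetical caller: this creates a generator object and discards it.
    # Nothing iterates it, so the body of parse_paragraph never runs.
    parse_paragraph(texts)
    return []


def parse_fixed(texts):
    # Hypothetical caller: delegate to the inner generator so that every
    # item it yields is passed on to whoever iterates parse_fixed.
    yield from parse_paragraph(texts)


broken_items = list(parse_broken(['a', 'b']))
fixed_items = list(parse_fixed(['a', 'b']))

print(broken_items)
print(fixed_items)
```

In a real spider the same idea would apply to the callback that calls self.parse_paragraph(...): it would need to iterate the result, either with yield from or with a for loop that re-yields each item, so the items actually reach Scrapy.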