How do I loop while scraping a page in Python?


python scrapy


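I am scraping a page and have run into a problem. I don't want to keep writing lines like items['Paragraphs'] = response.css('p::text').extract() over and over in my function. I would like to use a loop instead. I have tried several times, but every attempt failed. Here is the code: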

def parse_about(self, response):
    # do your stuff on a page
    items = response.meta['items']
    names = {'name1':'Headings','name2':'Paragraphs'}
    finder = {'find1':'h2::text , #mainContent h1::text','find2':'p::text'}
    for name in names.values():
        for find in finder.values():
            items[name] = response.css(find).extract()
            yield items

Can you describe what output you are trying to get?

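As far as I understand, you can apply zip to the two dicts. It pairs up their values, so each name is iterated together with its matching selector, which is much cleaner than two nested loops. Note that this relies on both dicts listing their entries in the same order. It is also better to yield the item once, after the loop has finished, rather than on every iteration.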

def parse_about(self, response):
    # do your stuff on a page
    items = response.meta['items']
    names = {'name1':'Headings','name2': 'Paragraphs'}
    finder = {'find1':'h2::text , #mainContent h1::text', 'find2': 'p::text'}
    for name, find in zip(names.values(), finder.values()):
        items[name] = response.css(find).extract()
    yield items
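
To see why the nested loops in the question misbehave, here is a minimal sketch that runs without Scrapy. FAKE_PAGE and fake_extract are hypothetical stand-ins for the page and for response.css(selector).extract(); the sample strings are made up. It assumes Python 3.7+, where dicts keep insertion order.

```python
# Hypothetical stand-in for a scraped page: maps a CSS selector to the
# text that response.css(selector).extract() might return.
FAKE_PAGE = {
    'h2::text , #mainContent h1::text': ['About us'],
    'p::text': ['First paragraph', 'Second paragraph'],
}


def fake_extract(selector):
    # Assumption: replaces response.css(selector).extract() for this demo.
    return FAKE_PAGE[selector]


names = {'name1': 'Headings', 'name2': 'Paragraphs'}
finder = {'find1': 'h2::text , #mainContent h1::text', 'find2': 'p::text'}


def nested():
    # Mirrors the question: every name is combined with every selector,
    # and the item is yielded on each inner iteration.
    items = {}
    for name in names.values():
        for find in finder.values():
            items[name] = fake_extract(find)
            yield dict(items)  # copy, so each snapshot can be inspected


def paired():
    # Mirrors the zip version: each name gets exactly one selector,
    # and the item is yielded once at the end.
    items = {}
    for name, find in zip(names.values(), finder.values()):
        items[name] = fake_extract(find)
    yield items


nested_results = list(nested())
paired_results = list(paired())

print(len(nested_results))   # number of items the nested version yields
print(nested_results[-1])    # last item from the nested version
print(paired_results)        # the single item from the zip version
```

The nested version yields four times, and in the last item both keys hold the paragraph text, because the inner loop always finishes on the 'p::text' selector. The zip version yields one item with each key matched to its own selector.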

Or, why not write the right dict from the start?

def parse_about(self, response):
    # do your stuff on a page
    items = response.meta['items']
    dct = {
        'Headings': 'h2::text , #mainContent h1::text',
        'Paragraphs': 'p::text',
    }
    for name, find in dct.items():
        items[name] = response.css(find).extract()
    yield items
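
If several callbacks need the same pattern, the loop can be pulled into a small helper. This is only a sketch: extract_fields, fake_extract, SELECTORS, the 'url' key and the sample values are all hypothetical names introduced for illustration. In a real spider you would presumably pass something like lambda css: response.css(css).extract() as the extract argument.

```python
def extract_fields(extract, selectors):
    """Return {field name: extracted values} for every selector.

    `extract` is any callable that takes a CSS selector string and
    returns a list of strings.
    """
    return {name: extract(css) for name, css in selectors.items()}


def fake_extract(css):
    # Hypothetical stand-in for response.css(css).extract().
    page = {
        'h2::text , #mainContent h1::text': ['About us'],
        'p::text': ['First paragraph', 'Second paragraph'],
    }
    return page.get(css, [])


# One dict describes every field to scrape; adding a field is one new line.
SELECTORS = {
    'Headings': 'h2::text , #mainContent h1::text',
    'Paragraphs': 'p::text',
}

# Stand-in for the item received via response.meta['items'].
items = {'url': 'http://example.com/about'}
items.update(extract_fields(fake_extract, SELECTORS))
print(items)
```

Keeping the selectors in one dict means the callback body does not change when you add or remove a field.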

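After the items variable, I don't want to write

items['Paragraphs'] = response.css('p::text').extract()

and so on. Instead I want a loop, so that I don't have to write those lines again and again for each part of the page I scrape. Here I am extracting the headings and text from the first page of the Dmoz website.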