
Python: custom Scrapy method not being called


I ran into a problem while parsing a web page with Scrapy: my custom method is never called by Scrapy. The URL is:, and the code is:

import scrapy
from shufa.items import DuilianItem

class DuilianSpiderSpider(scrapy.Spider):
    name = 'duilian_spider'
    start_urls = [
        {"url": "http://www.duilian360.com/chunjie/117.html", "category_name": "春联", "group_name": "鼠年春联"},
    ]
    base_url = 'http://www.duilian360.com'

    def start_requests(self):
        for topic in self.start_urls:
            url = topic['url']
            yield scrapy.Request(url=url)

    def parse(self, response):
        div_list = response.xpath("//div[@class='contentF']/div[@class='content_l']/p")
        self.parse_paragraph(div_list)

    def parse_paragraph(self, div_list):
        for div in div_list:
            duilian_text_list = div.xpath('./text()').extract()
            for duilian_text in duilian_text_list:
                duilian_item = DuilianItem()
                duilian_item['category_id'] = 1
                duilian = duilian_text
                duilian_item['name'] = duilian
                duilian_item['desc'] = ''
                print('I reach here...')
                yield duilian_item
In the code above, the method parse_paragraph is never executed: the print statement produces no output, and even a breakpoint set on the print line is never hit, so I cannot step into this method.

However, if I move all of the code from parse_paragraph into the calling method parse, everything works fine. Why?

# -*- coding: utf-8 -*-
import scrapy
from shufa.items import DuilianItem

class DuilianSpiderSpider(scrapy.Spider):
    name = 'duilian_spider'
    start_urls = [
        {"url": "http://www.duilian360.com/chunjie/117.html", "category_name": "春联", "group_name": "鼠年春联"},
    ]
    base_url = 'http://www.duilian360.com'

    def start_requests(self):
        for topic in self.start_urls:
            url = topic['url']
            yield scrapy.Request(url=url)

    def parse(self, response):
        div_list = response.xpath("//div[@class='contentF']/div[@class='content_l']/p")
        for div in div_list:
            duilian_text_list = div.xpath('./text()').extract()
            for duilian_text in duilian_text_list:
                duilian_item = DuilianItem()
                duilian_item['category_id'] = 1
                duilian = duilian_text
                duilian_item['name'] = duilian
                duilian_item['desc'] = ''
                print('I reach here...')
                yield duilian_item

    # def parse_paragraph(self, div_list):
    #     for div in div_list:
    #         duilian_text_list = div.xpath('./text()').extract()
    #         for duilian_text in duilian_text_list:
    #             duilian_item = DuilianItem()
    #             duilian_item['category_id'] = 1
    #             duilian = duilian_text
    #             duilian_item['name'] = duilian
    #             duilian_item['desc'] = ''
    #             print('I reach here...')
    #             yield duilian_item

My code has many custom methods like this, and I don't want to move all of their code into the calling methods; that would not be good practice.

I would use yield from instead of calling parse_paragraph directly: calling it only returns a generator object, whereas yield from actually yields the items/requests produced by the other parser.

def parse(self, response):
    div_list = response.xpath("//div[@class='contentF']/div[@class='content_l']/p")
    yield from self.parse_paragraph(div_list)
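The difference can be demonstrated without Scrapy at all. Below is a minimal plain-Python sketch (the function names are illustrative, not from the spider) of why calling a generator function directly runs none of its body, while yield from delegates to it:

```python
def make_items(texts):
    """A generator function, like parse_paragraph in the spider."""
    for text in texts:
        print('I reach here...')
        yield {'name': text}

def parse_direct(texts):
    # Calling a generator function only CREATES a generator object.
    # Its body never runs because nothing iterates the result.
    make_items(texts)

def parse_delegating(texts):
    # yield from iterates the inner generator and re-yields each item,
    # so the body of make_items actually executes.
    yield from make_items(texts)

parse_direct(['a', 'b'])                     # prints nothing
items = list(parse_delegating(['a', 'b']))   # prints twice, collects 2 items
```

Scrapy only iterates the generator returned by the callback it invokes (parse), so any inner generator must be delegated to with yield from (or iterated explicitly) for its items to reach the engine.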

Bro, you saved me! It works perfectly.