
Scrapy - spider hits an error on the spider_idle signal


I am trying to create a spider that keeps running; whenever it reaches the idle state, it should fetch the next URL to parse from the database. Unfortunately, I got stuck right at the start:

# -*- coding: utf-8 -*-
import scrapy

from scrapy import signals
from scrapy import Spider

import logging

class SignalspiderSpider(Spider):
    name = 'signalspider'
    allowed_domains = ['domain.de']

    yet = False

    def start_requests(self):
        logging.log(logging.INFO, "______ Loading requests")
        yield scrapy.Request('https://www.domain.de/product1.html')

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        logging.log(logging.INFO, "______ From Crawler")
        spider = super(SignalspiderSpider, cls).from_crawler(crawler, *args, **kwargs)
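        # Listen for spider_idle so idle() runs whenever the request queue empties.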
        crawler.signals.connect(spider.idle, signal=scrapy.signals.spider_idle)
        return spider


    def parse(self, response):
        self.logger.info("______ Finished extracting structured data from HTML")
        pass

    def idle(self):
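        # spider_idle handler: schedule one extra request the first time the spider goes idle.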
        logging.log(logging.INFO, "_______ Idle state")
        if not self.yet:
            self.crawler.engine.crawl(self.create_request(), self)
            self.yet = True


    def create_request(self):
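        # 'yield' makes this a generator function, so calling it returns a generator object, not a Request.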
        logging.log(logging.INFO, "_____________ Create requests")
        yield scrapy.Request('https://www.domain.de/product2.html?dvar_82_color=blau&cgid=')
The error I get is:

2019-03-27 21:41:38 [root] INFO: _______ Idle state
2019-03-27 21:41:38 [root] INFO: _____________ Create requests
2019-03-27 21:41:38 [scrapy.utils.signal] ERROR: Error caught on signal handler: <bound method RefererMiddleware.request_scheduled of <scrapy.spidermiddlewares.referer.RefererMiddleware object at 0x7f93bcc13978>>
Traceback (most recent call last):
  File "/home/spidy/Documents/spo/lib/python3.5/site-packages/scrapy/utils/signal.py", line 30, in send_catch_log
    *arguments, **named)
  File "/home/spidy/Documents/spo/lib/python3.5/site-packages/pydispatch/robustapply.py", line 55, in robustApply
    return receiver(*arguments, **named)
  File "/home/spidy/Documents/spo/lib/python3.5/site-packages/scrapy/spidermiddlewares/referer.py", line 343, in request_scheduled
    redirected_urls = request.meta.get('redirect_urls', [])
AttributeError: 'NoneType' object has no attribute 'meta'
What am I doing wrong?

Try:

from scrapy import Request

def idle(self, spider):
    logging.log(logging.INFO, "_______ Idle state")
    if not self.yet:
        self.yet = True
        # Hand a ready-made Request object to engine.crawl(), not a generator.
        self.crawler.engine.crawl(
            Request(url='https://www.domain.de/product2.html?dvar_82_color=blau&cgid=',
                    callback=spider.parse),
            spider)
I am not sure it is right to create the request in the spider_idle handler by calling another method that produces it, the way you do: because create_request uses yield, it is a generator function, so self.create_request() returns a generator object rather than a Request, and engine.crawl() then schedules something that has no .meta attribute, which is what the RefererMiddleware traceback is complaining about. Passing the Request object to engine.crawl() directly avoids that.
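For the broader goal in the question (a spider that never closes and pulls the next URL from a database whenever it goes idle), a minimal sketch of the usual pattern is below. The fetch_next_url_from_db() helper, the class name, and the URLs are assumptions, not part of the original code, and the two-argument engine.crawl(request, spider) call matches the Scrapy 1.x API used in the question (recent Scrapy releases take only the request). Raising DontCloseSpider is what keeps the spider alive between idle periods.

# -*- coding: utf-8 -*-
# Minimal sketch only; fetch_next_url_from_db() is a hypothetical stand-in.
import logging

import scrapy
from scrapy import Spider, signals
from scrapy.exceptions import DontCloseSpider


def fetch_next_url_from_db():
    """Hypothetical database lookup; return a URL string or None."""
    return None


class IdleQueueSpider(Spider):
    name = 'idlequeuespider'
    allowed_domains = ['domain.de']

    def start_requests(self):
        yield scrapy.Request('https://www.domain.de/product1.html')

    @classmethod
    def from_crawler(cls, crawler, *args, **kwargs):
        spider = super(IdleQueueSpider, cls).from_crawler(crawler, *args, **kwargs)
        # Run spider_idle() whenever the engine has no more requests to process.
        crawler.signals.connect(spider.spider_idle, signal=signals.spider_idle)
        return spider

    def parse(self, response):
        logging.info("Parsed %s", response.url)

    def spider_idle(self, spider):
        url = fetch_next_url_from_db()
        if url:
            # Hand a ready-made Request object to the engine (not a generator).
            self.crawler.engine.crawl(scrapy.Request(url, callback=self.parse), spider)
        # Keep the spider open so it can go idle again and poll the database later.
        raise DontCloseSpider

Connecting the handler in from_crawler mirrors the question's setup; the difference is that a ready-made Request object is handed to engine.crawl() and DontCloseSpider is raised so the spider stays open even when the database has nothing new yet.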
