Scrapy: all scrapers fail when one spider has a syntax error

Tags: python, web-scraping, scrapy, syntax-error

Sometimes, when one scraper contains an error, all scrapers fail. Example: I have a spider with a syntax error:

class MySpiderWithSyntaxError(scrapy.Spider):
    name = "my_spider_with_syntax_error"

    start_urls = [
        'http://www.website.com'
    ]

    def parse(self response):
        for url in response.css('a.p::attr(href)').extract():
            print url
The comma between the parameters is missing in this line of the spider:

def parse(self response):
The spider with the syntax error will fail. But when I run another spider that has no syntax error, I get an error like this:

    Traceback (most recent call last):
    File "/home/Documents/project/.env/bin/scrapy", line 11, in <module>
       sys.exit(execute())
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/cmdline.py", line 141, in execute
       cmd.crawler_process = CrawlerProcess(settings)
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/crawler.py", line 238, in __init__
       super(CrawlerProcess, self).__init__(settings)
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/crawler.py", line 129, in __init__
       self.spider_loader = _get_spider_loader(settings)
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/crawler.py", line 325, in _get_spider_loader
       return loader_cls.from_settings(settings.frozencopy())
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 33, in from_settings
       return cls(settings)
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 20, in __init__
       self._load_all_spiders()
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/spiderloader.py", line 28, in _load_all_spiders
       for module in walk_modules(name):
    File "/home/Documents/project/.env/local/lib/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
       submod = import_module(fullpath)
    File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
       __import__(name)
    File "/home/Documents/project/scrapers/scrapy/spiders/my_spider_with_syntax_error.py", line 14
       def parse(self response):
                     ^
    SyntaxError: invalid syntax
Question:
Is it possible to catch such an error so that only the spider with the syntax error fails, while the other spiders keep working?

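If you use a Scrapy project, all spider modules are loaded even when you run a single spider (with `scrapy crawl`). So if any one of them contains a syntax error, you will get an error.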
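Because every spider module is imported at startup, one practical precaution is to compile all spider files before launching a crawl. The following is a minimal sketch using only the standard library, not part of Scrapy; the function name `find_syntax_errors` and the idea of pointing it at the project's spiders directory are assumptions made for illustration.

```python
import os


def find_syntax_errors(directory):
    """Return {path: message} for each .py file under directory that fails to compile.

    Hypothetical helper: it only parses the files, it never executes them.
    """
    broken = {}
    for root, _dirs, files in os.walk(directory):
        for filename in sorted(files):
            if not filename.endswith(".py"):
                continue
            path = os.path.join(root, filename)
            with open(path, "rb") as handle:
                source = handle.read()
            try:
                # compile() parses the source without running the module.
                compile(source, path, "exec")
            except SyntaxError as exc:
                broken[path] = "line %s: %s" % (exc.lineno, exc.msg)
    return broken
```

Running a check like this over the spiders directory (for example in a pre-commit hook or a CI step) reports the broken file before `scrapy crawl` is ever invoked. Note that it checks the files against the interpreter that runs the check, so it should be the same Python version the project uses.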

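You cannot use exception handling inside a module to deal with compile-time errors such as a syntax error in that same module, because the module fails to compile before any of its code runs. Inside the module you can only handle runtime errors.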
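A note on scope: a `try`/`except` written inside the broken file can never catch that file's own syntax error, but code that imports the file receives an ordinary `SyntaxError` exception at import time. The sketch below illustrates this with plain `importlib`; it is not how Scrapy's spider loader is configured, and the helper name `import_skipping_broken` is hypothetical.

```python
import importlib


def import_skipping_broken(module_names):
    """Try to import each module; collect the ones that fail with SyntaxError.

    Returns (loaded, skipped), where skipped holds (name, line number) pairs.
    """
    loaded, skipped = [], []
    for name in module_names:
        try:
            importlib.import_module(name)
        except SyntaxError as exc:
            # Raised by the importer while compiling the module's source.
            skipped.append((name, exc.lineno))
        else:
            loaded.append(name)
    return loaded, skipped
```

A loader written this way keeps the healthy modules available and reports the broken ones, which is the behaviour the question asks for. Whether it is wise to continue silently is a design choice: failing loudly, as in the traceback above, makes the broken spider hard to miss.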