Python 2.7: Scrapy spider won't run


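I can scrape with Python using Beautiful Soup and mechanize, but for some reason Scrapy just doesn't work when I try it. Here is an example of what happens when I test the scraper from the tutorial: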

Project name & BOT_NAME = tutorial

Below are the items.py and settings.py I am using.

items.py

settings.py

Command


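The problem is that you are putting your spider in items.py.

Instead, create a package named spiders, create a dmoz.py inside it, and put your spider there.

See the corresponding section of the tutorial for more details.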
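To see why the file location matters: Scrapy only looks for spiders in the modules listed in SPIDER_MODULES (here, tutorial.spiders), so a spider class sitting in items.py is never registered under the name dmoz. The snippet below is only a rough diagnostic sketch, not Scrapy's own discovery code. The helper name find_spider_names is hypothetical, and it assumes each spider sets its name as a plain string literal at class level. It scans a directory statically, without importing anything.

```python
import ast
import os


def find_spider_names(spiders_dir):
    """Map each class-level `name = "..."` literal to the file that defines it.

    Hypothetical helper for illustration only; Scrapy itself imports the
    modules listed in SPIDER_MODULES rather than parsing source files.
    """
    found = {}
    if not os.path.isdir(spiders_dir):
        return found  # no spiders package at all
    for file_name in sorted(os.listdir(spiders_dir)):
        if not file_name.endswith(".py"):
            continue
        with open(os.path.join(spiders_dir, file_name)) as source:
            tree = ast.parse(source.read(), file_name)
        for node in tree.body:
            if not isinstance(node, ast.ClassDef):
                continue
            for stmt in node.body:
                if not isinstance(stmt, ast.Assign):
                    continue
                targets = [t.id for t in stmt.targets if isinstance(t, ast.Name)]
                if "name" not in targets:
                    continue
                try:
                    value = ast.literal_eval(stmt.value)
                except ValueError:
                    continue  # `name` is computed, not a plain literal
                if isinstance(value, str):
                    found[value] = file_name
    return found
```

Calling find_spider_names(os.path.join("tutorial", "spiders")) from the project root should list dmoz once the spider has been moved into tutorial/spiders/dmoz.py. An empty result suggests the spider is still somewhere Scrapy will not look.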

# items.py (as posted in the question)
import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        with open(filename, 'wb') as f:
            f.write(response.body)

# settings.py
BOT_NAME = 'tutorial'

SPIDER_MODULES = ['tutorial.spiders']
NEWSPIDER_MODULE = 'tutorial.spiders'

C:\Users\Turbo>scrapy startproject tutorial
New Scrapy project 'tutorial' created in:
    C:\Users\Turbo\tutorial

You can start your first spider with:
    cd tutorial
    scrapy genspider example example.com

C:\Users\Turbo>cd tutorial

C:\Users\Turbo\tutorial>scrapy crawl dmoz
Traceback (most recent call last):
  File "C:\Python27\Scripts\scrapy-script.py", line 9, in <module>
    load_entry_point('scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py"
, line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py"
, line 89, in _run_print_help
    func(*a, **kw)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py"
, line 150, in _run_command
    cmd.run(args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\commands\cr
awl.py", line 58, in run
    spider = crawler.spiders.create(spname, **opts.spargs)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanag
er.py", line 44, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: dmoz'