Python 2.7 scraper won't scrape
I can run Python with Beautiful Soup and Mechanize just fine, but for some reason, when I try to use Scrapy, it just doesn't work. Below is an example showing what happens when I try to test the scraper following the tutorial (project name & bot name = tutorial). Shown below are the items.py and settings.py scripts I used, the commands I ran, and the resulting traceback.
The problem is that you are putting the spider into items.py. Instead, create a spiders package, create a dmoz.py inside it, and put your spider there. See the tutorial for more details.
dmoz.py (in tutorial/spiders/):

import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # Save each page body to a file named after the last URL path segment
        filename = response.url.split("/")[-2]
        with open(filename, 'wb') as f:
            f.write(response.body)
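As a side note, the parse callback above names each saved file after the final segment of the URL path. A quick standalone sketch of that slicing logic (no Scrapy needed, URL taken from start_urls above):

```python
# The trailing slash means the last element of the split is an empty
# string, so [-2] picks the final real path segment ("Books").
url = "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/"
filename = url.split("/")[-2]
print(filename)  # Books
```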
settings.py:

BOT_NAME = 'tutorial'
SPIDER_MODULES = ['tutorial.spiders']
NEWSPIDER_MODULE = 'tutorial.spiders'
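Note that SPIDER_MODULES points at tutorial.spiders, so Scrapy only discovers spiders defined inside that package. After the fix suggested above, the project should look roughly like the standard scrapy startproject layout:

```
tutorial/
    scrapy.cfg
    tutorial/
        __init__.py
        items.py
        settings.py
        spiders/
            __init__.py
            dmoz.py        # DmozSpider goes here
```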
C:\Users\Turbo>scrapy startproject tutorial
New Scrapy project 'tutorial' created in:
C:\Users\Turbo\tutorial
You can start your first spider with:
cd tutorial
scrapy genspider example example.com
C:\Users\Turbo>cd tutorial
C:\Users\Turbo\tutorial>scrapy crawl dmoz
Traceback (most recent call last):
  File "C:\Python27\Scripts\scrapy-script.py", line 9, in <module>
    load_entry_point('scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\commands\crawl.py", line 58, in run
    spider = crawler.spiders.create(spname, **opts.spargs)
  File "C:\Python27\lib\site-packages\scrapy-0.24.4-py2.7.egg\scrapy\spidermanager.py", line 44, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: dmoz'
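The KeyError at the bottom of the traceback is the symptom of the misplaced spider: Scrapy's spider manager imports the modules listed in SPIDER_MODULES, records each spider class it finds under its name attribute, and then looks the requested name up in that registry. A spider defined in items.py is never imported, so the registry stays empty. A simplified, illustrative model of that lookup (a hypothetical sketch, not Scrapy's actual code):

```python
# Hypothetical sketch of name-based spider discovery; names and
# helpers here are illustrative, not Scrapy's real internals.
class DmozSpider(object):
    name = "dmoz"

def build_registry(spider_classes):
    # Scrapy builds a similar map by importing every module
    # listed in SPIDER_MODULES and scanning it for spider classes.
    return {cls.name: cls for cls in spider_classes}

def create(registry, spider_name):
    if spider_name not in registry:
        raise KeyError("Spider not found: %s" % spider_name)
    return registry[spider_name]()

# With no spiders discovered (spider hidden in items.py), lookup fails:
try:
    create(build_registry([]), "dmoz")
except KeyError as e:
    print(e)  # 'Spider not found: dmoz'

# Once the spider lives in tutorial/spiders/, discovery succeeds:
spider = create(build_registry([DmozSpider]), "dmoz")
print(spider.name)  # dmoz
```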