Python 2.7: Scrapy spider fails with a SyntaxError on append()
I took this example from scrapy.org. It worked fine until I tried to save everything into the items list. items.append(item) is apparently invalid syntax, yet every other example on this site uses the same statement.
from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector

from tutorial.items import DmozItem

class DmozSpider(BaseSpider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)
        sites = hxs.select('//ul/li')
        items = []
        for site in sites:
            item = DmozItem()
            item['title'] = site.select('a/text()').extract()
            item['link'] = site.select('a/@href').extract()
            item['desc'] = site.select('text()').extract()
            items.append(item)
        return items
The error is:
computerito@computerito-the-great ~/SHITSHOW/tutorial $ scrapy crawl dmoz
2015-03-10 22:00:40-0700 [scrapy] INFO: Scrapy 0.14.4 started (bot: tutorial)
2015-03-10 22:00:40-0700 [scrapy] DEBUG: Enabled extensions: LogStats, TelnetConsole, CloseSpider, WebService, CoreStats, MemoryUsage, SpiderState
Traceback (most recent call last):
File "/usr/bin/scrapy", line 4, in <module>
execute()
File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 132, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 97, in _run_print_help
func(*a, **kw)
File "/usr/lib/python2.7/dist-packages/scrapy/cmdline.py", line 139, in _run_command
cmd.run(args, opts)
File "/usr/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 43, in run
spider = self.crawler.spiders.create(spname, **opts.spargs)
File "/usr/lib/python2.7/dist-packages/scrapy/command.py", line 34, in crawler
self._crawler.configure()
File "/usr/lib/python2.7/dist-packages/scrapy/crawler.py", line 36, in configure
self.spiders = spman_cls.from_crawler(self)
File "/usr/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 37, in from_crawler
return cls.from_settings(crawler.settings)
File "/usr/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 33, in from_settings
return cls(settings.getlist('SPIDER_MODULES'))
File "/usr/lib/python2.7/dist-packages/scrapy/spidermanager.py", line 23, in __init__
for module in walk_modules(name):
File "/usr/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 65, in walk_modules
submod = __import__(fullpath, {}, {}, [''])
File "/home/computerito/SHITSHOW/tutorial/tutorial/spiders/dmoz_spider.py", line 25
items.append(item)
^
SyntaxError: invalid syntax
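What is the exact error you are getting? You can edit the question to add this information. With that information, someone is more likely to be able to help you.
When I run your code, I can't find any errors.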
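Since the code as posted runs cleanly, one possible explanation (an assumption, not a confirmed diagnosis) is that the file on disk differs from what was pasted, for example an unclosed parenthesis on the line just before items.append(item). Older interpreters such as Python 2.7 typically report that kind of mistake on the following line. Here is a minimal sketch of how to check for it with the built-in compile(). The helper name check_syntax and the sample source strings are made up for illustration and do not come from the question.

```python
def check_syntax(source, filename="<spider>"):
    """Return None if the source compiles, else (line number, message)."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as exc:
        return exc.lineno, exc.msg
    return None


# Hypothetical reproduction: the third line is missing its closing parenthesis.
broken = (
    "items = []\n"
    "item = {}\n"
    "item['desc'] = str(1\n"
    "items.append(item)\n"
)

# The same source with the parenthesis closed.
fixed = broken.replace("str(1\n", "str(1)\n")

print(check_syntax(broken))  # a (line number, message) tuple; details vary by Python version
print(check_syntax(fixed))   # None, because the source compiles
```

To check the real file, read dmoz_spider.py into a string and pass it to the same helper. Whichever line gets reported, look at the line above it as well.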