Python and Scrapy: referencing an item's attributes
On the Scrapy tutorial site, they have this code for an item:
import scrapy

class DmozItem(scrapy.Item):
    title = scrapy.Field()
    link = scrapy.Field()
    desc = scrapy.Field()
Then they have the code for the spider:
import scrapy
from tutorial.items import DmozItem

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            item = DmozItem()
            item['title'] = sel.xpath('a/text()').extract()
            item['link'] = sel.xpath('a/@href').extract()
            item['desc'] = sel.xpath('text()').extract()
            yield item
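My question is: why can they use [] brackets to reference the item's title? I thought that when you reference an attribute, it should be item.title = whatever. Is there something I'm missing?

This is because, behind the scenes, Scrapy's Item class uses a mixin: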
class UserDict.DictMixin
Mixin defining all dictionary methods for classes that already have a minimum dictionary interface including __getitem__(), __setitem__(), __delitem__(), and keys().
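To make the mixin idea concrete, here is a minimal sketch. It is not Scrapy's code: UserDict.DictMixin is a Python 2 class, so the sketch assumes Python 3 and uses collections.abc.MutableMapping, which plays the same role there. TinyItem and its _values attribute are hypothetical names chosen for illustration. Only the minimal interface is written by hand; the mixin supplies the rest of the dictionary methods.

```python
from collections.abc import MutableMapping


class TinyItem(MutableMapping):
    """Hypothetical container; only the minimal mapping interface is hand-written."""

    def __init__(self):
        self._values = {}

    def __getitem__(self, key):
        # Runs for item["title"]
        return self._values[key]

    def __setitem__(self, key, value):
        # Runs for item["title"] = ...
        self._values[key] = value

    def __delitem__(self, key):
        # Runs for del item["title"]
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


item = TinyItem()
item["title"] = "Example book"           # calls __setitem__
item.update(link="http://example.com")   # update() is inherited from the mixin
print(item["title"])                     # calls __getitem__
print(item.get("desc", "n/a"))           # get() is inherited from the mixin
print(sorted(item.keys()))               # keys() is inherited from the mixin
```

The square brackets work because Python translates item[key] into a call to __getitem__ or __setitem__ on the object's class, regardless of whether that class is a real dict.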
Also, a quote from Scrapy's documentation:
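Item objects are simple containers used to collect the scraped data.
They provide a dictionary-like API with a convenient syntax for
declaring their available fields.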
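The "declaring their available fields" part can be sketched as follows. This illustrates the general idea only and should not be read as Scrapy's actual implementation: Field, SketchItem, BookItem, and the fields and _values attributes are all hypothetical names, and the sketch assumes Python 3.6+ for __init_subclass__. The point is that a class-level line such as title = Field() can act as a declaration of an allowed key, while the values themselves are stored and read through the [] syntax.

```python
class Field(dict):
    """Hypothetical marker object used to declare a field on a class."""


class SketchItem:
    """Hypothetical base class: dict-style access limited to declared fields."""

    fields = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Collect every class attribute that was declared as a Field
        cls.fields = {
            name: value
            for name, value in vars(cls).items()
            if isinstance(value, Field)
        }

    def __init__(self):
        self._values = {}

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        # Reject keys that were never declared on the class
        if key not in self.fields:
            raise KeyError(f"{key!r} is not a declared field")
        self._values[key] = value


class BookItem(SketchItem):
    title = Field()
    link = Field()


book = BookItem()
book["title"] = "Example book"   # allowed: "title" was declared
print(book["title"])

try:
    book["price"] = 10           # not declared on BookItem
except KeyError as exc:
    print("rejected:", exc)
```

With this design a misspelled field name fails at assignment time instead of silently creating a new key.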