Python Scrapy: extracting the value of an aria-label attribute
I am new to Scrapy and I am trying to scrape a page where a div with a class carries an aria-label attribute:
<body>
<div class="item-price" aria-label="$1.99">
.....
</div>
</body>
When I run the spider, I get the following error:
2018-09-02 18:34:03 [scrapy.core.scraper] ERROR: Spider must return Request, BaseItem, dict or None, got 'list' in <GET https://example.com/test.html>
How can I extract the value of the aria-label here?

There are several errors in your code. Here is a corrected version:
def parse(self, response):
    item = {}
    item["price"] = response.xpath('//div[@class="item-price"]/@aria-label').extract_first()
    yield item
If you want to use a CSS selector instead of XPath:
def parse(self, response):
    # Use a dict with a "price" key; {value} alone would build a set, which Scrapy rejects
    item = {"price": response.css('div.item-price::attr(aria-label)').extract_first()}
    yield item