Python Scrapy: parse a table and skip table rows by class name
The URL I am trying to parse is http://live.mystocks.co.ke/price_list/. I want to iterate over the rows whose class is "row r1" or "row r0", but skip the rows whose class is just "row". I found that I was using the wrong variable: the loop body reads from rows_r1 instead of row, so my output basically repeats the first row over and over:

[synergyspider] DEBUG: Scraped from {'adjustedPrice': u'-', 'change': u'-', 'code': u'EGAD', 'date': u'Price list and trading summary for Wednesday, April 2, 2014', 'day_high': u'29.00', 'last12_high': u'100.00', 'last12_low': u'30.00', 'name': u'Eaagads', 'percentChange': u'-', 'previous': u'29.00', 'price': u'29.00', 'volume': …
import scrapy
from mystocks.items import MystocksItem
from scrapy.selector import Selector

class Synergyspider(scrapy.Spider):
    name = "synergyspider"
    # allowed_domains takes bare domain names, not URLs
    allowed_domains = ["live.mystocks.co.ke"]
    start_urls = ["http://live.mystocks.co.ke/price_list/"]

    def parse(self, response):
        sel = Selector(response)
        head = sel.xpath('//*[@id="main"]/h2')
        # Match both "row r1" and "row r0" rows; plain "row" rows are excluded
        rows = sel.xpath('//tr[@class = "row r1" or @class = "row r0"]')
        items = []
        for row in rows:
            item = MystocksItem()
            item['date'] = head.xpath('text()').extract()[0]
            # Read each cell from the current row, not from the whole row set
            item['code'] = row.xpath('./td[1]/a/text()').extract()[0]
            item['name'] = row.xpath('./td[2]/text()').extract()[0]
            item['last12_low'] = row.xpath('./td[3]/text()').extract()[0]
            item['last12_high'] = row.xpath('./td[4]/text()').extract()[0]
            item['day_low'] = row.xpath('./td[5]/text()').extract()[0]
            item['day_high'] = row.xpath('./td[6]/text()').extract()[0]
            item['price'] = row.xpath('./td[7]/text()').extract()[0]
            item['previous'] = row.xpath('./td[8]/text()').extract()[0]
            item['change'] = row.xpath('./td[9]/text()').extract()[0]
            item['percentChange'] = row.xpath('./td[10]/text()').extract()[0]
            item['volume'] = row.xpath('./td[12]/text()').extract()[0]
            item['adjustedPrice'] = row.xpath('./td[13]/text()').extract()[0]
            items.append(item)
        return items
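The row-filtering logic itself can be checked without running a full Scrapy crawl. The sketch below, using only the standard library and a hypothetical table fragment modeled on the price list, shows the same idea as the XPath predicate '//tr[@class="row r1" or @class="row r0"]': keep rows whose class attribute is exactly "row r0" or "row r1" and skip the plain "row" ones.

```python
import xml.etree.ElementTree as ET

# Minimal table resembling the price-list markup (hypothetical sample data)
HTML = """
<table>
  <tr class="row"><td>separator row to skip</td></tr>
  <tr class="row r1"><td>EGAD</td><td>Eaagads</td></tr>
  <tr class="row r0"><td>KUKZ</td><td>Kakuzi</td></tr>
</table>
"""

root = ET.fromstring(HTML)
# Keep only rows whose class is exactly "row r0" or "row r1"
wanted = {"row r0", "row r1"}
rows = [tr for tr in root.iter("tr") if tr.get("class") in wanted]
codes = [tr.find("td").text for tr in rows]
print(codes)  # ['EGAD', 'KUKZ']
```

Note that the original bug was independent of the selector: because every field was read from rows_r1 (the whole SelectorList) instead of the loop variable row, .extract()[0] always returned the first matching cell in the document, so every yielded item was a copy of the first row.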