Python Scrapy XML documents
I'm new to Python and Scrapy. I'm trying to extract data from multiple XML documents. I found the XML here; the documents range from CourseId=1 all the way to CourseId=4500:
Example 1:
My code is below. When I run it, I get a TypeError: unbound method body_as_unicode() must be called with XmlResponse instance as first argument
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import XmlXPathSelector
from myProject.items import RTOData
from scrapy.http import Request
from scrapy.http import XmlResponse

class myProjectSpider(CrawlSpider):
    name = 'myProject'
    allowed_domains = ['myskills.gov.au']
    start_urls = ['http://www.myskills.gov.au/DesktopModules/Services/api/RegisteredTrainers/GetOfferedTrainers?LocationID=0&Distance=25&IsExplicit=false&CourseId=1']

    def parse_start_url(self, response):
        x = XmlXPathSelector(XmlResponse)
        Latitude = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/Latitude').extract()
        Longitude = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/Longitude').extract()
        RTOCode = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/RTOCode').extract()
        SiteName = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/SiteName').extract()
Can anyone tell me whether I'm on the right track?
Can you post the full traceback? Where is the body_as_unicode part?
Here is the full traceback:
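Separately from the traceback, the error message quoted in the question is consistent with the selector receiving the XmlResponse class itself rather than the response object that is passed into parse_start_url. If that is the cause, building the selector from response instead of XmlResponse would probably resolve it. The sketch below illustrates the same class-versus-instance mistake using only the standard library, so it runs without Scrapy. FakeResponse, extract_rows, course_urls and the sample XML are hypothetical names and data made up for illustration. The XML shape is inferred from the XPath expressions in the question, and the real service output may differ (for example, it may declare a namespace).

```python
import xml.etree.ElementTree as ET

# URL pattern copied from start_urls in the question; only CourseId varies.
BASE_URL = ("http://www.myskills.gov.au/DesktopModules/Services/api/"
            "RegisteredTrainers/GetOfferedTrainers"
            "?LocationID=0&Distance=25&IsExplicit=false&CourseId={}")

# Hypothetical sample shaped like the XPath expressions in the question.
SAMPLE_XML = """<ArrayOfRegisteredTrainerLocationOfferedItem>
  <RegisteredTrainerLocationOfferedItem>
    <Latitude>-33.86</Latitude>
    <Longitude>151.21</Longitude>
    <RTOCode>0001</RTOCode>
    <SiteName>Example Campus</SiteName>
  </RegisteredTrainerLocationOfferedItem>
</ArrayOfRegisteredTrainerLocationOfferedItem>"""


class FakeResponse(object):
    """Hypothetical stand-in for a response object; not Scrapy's API."""

    def __init__(self, body):
        self.body = body

    def body_as_unicode(self):
        return self.body


def extract_rows(response):
    """Parse the response body and return one dict per offered item."""
    # This call needs an instance; handing in the class raises TypeError.
    root = ET.fromstring(response.body_as_unicode())
    fields = ("Latitude", "Longitude", "RTOCode", "SiteName")
    return [
        dict((name, item.findtext(name)) for name in fields)
        for item in root.iter("RegisteredTrainerLocationOfferedItem")
    ]


def course_urls(first=1, last=4500):
    """Build one URL per CourseId in the range mentioned in the question."""
    return [BASE_URL.format(course_id) for course_id in range(first, last + 1)]


rows = extract_rows(FakeResponse(SAMPLE_XML))  # instance: works

try:
    extract_rows(FakeResponse)  # class instead of instance: mirrors the bug
    class_call_failed = False
except TypeError:
    class_call_failed = True

urls = course_urls()
```

In the spider itself, a list like the one returned by course_urls could serve as start_urls, assuming every CourseId in that range is worth requesting.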