Python: scraping XML documents with Scrapy


I'm new to Python and Scrapy. I'm trying to extract data from multiple XML documents. I found the XML here; the documents range from CourseId=1 all the way up to CourseId=4500:

Example 1:
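
Since the individual documents differ only in the CourseId query parameter, the full set of start URLs can be generated in a loop rather than listed by hand. A minimal sketch, assuming the endpoint behaves the same way for every CourseId from 1 to 4500 (the base URL is the one used in the spider further down):

BASE_URL = ('http://www.myskills.gov.au/DesktopModules/Services/api/'
            'RegisteredTrainers/GetOfferedTrainers'
            '?LocationID=0&Distance=25&IsExplicit=false&CourseId=%d')

# One start URL per CourseId in the 1-4500 range described above.
start_urls = [BASE_URL % course_id for course_id in range(1, 4501)]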

My code is below. When I run it, I get the following error:

TypeError: unbound method body_as_unicode() must be called with XmlResponse instance as first argument

from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import XmlXPathSelector
from myProject.items import RTOData
from scrapy.http import Request
from scrapy.http import XmlResponse


class myProjectSpider(CrawlSpider):
    name = 'myProject'
    allowed_domains = ['myskills.gov.au']
    start_urls = ['http://www.myskills.gov.au/DesktopModules/Services/api/RegisteredTrainers/GetOfferedTrainers?LocationID=0&Distance=25&IsExplicit=false&CourseId=1']

    def parse_start_url(self, response):
        x = XmlXPathSelector(XmlResponse)
        Latitude = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/Latitude').extract()
        Longitude = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/Longitude').extract()
        RTOCode = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/RTOCode').extract()
        SiteName = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/SiteName').extract()

Could someone tell me whether I'm on the right track?
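
For what it's worth, the TypeError points at the line x = XmlXPathSelector(XmlResponse): the selector is being built from the XmlResponse class itself rather than from the response object that Scrapy passes into parse_start_url, so body_as_unicode() ends up being called as an unbound method. A minimal sketch of the method with the response object passed in instead (it assumes RTOData declares the four fields used below):

    def parse_start_url(self, response):
        # Build the selector from the response Scrapy hands in,
        # not from the XmlResponse class.
        x = XmlXPathSelector(response)
        # RTOData is assumed to declare these four fields.
        item = RTOData()
        item['Latitude'] = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/Latitude').extract()
        item['Longitude'] = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/Longitude').extract()
        item['RTOCode'] = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/RTOCode').extract()
        item['SiteName'] = x.select('//ArrayOfRegisteredTrainerLocationOfferedItem/RegisteredTrainerLocationOfferedItem/SiteName').extract()
        return [item]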

Can you post the full traceback? Where is the body_as_unicode part?

Here is the full traceback: