Python: Scrapy errors with "&", "view", and OAI-PMH


python xml xpath scrapy


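I am using Scrapy to scrape a site that uses OAI-PMH. I'm not very familiar with how OAI-PMH affects Scrapy, but there seems to be a problem when I run the following command:

    scrapy view http://fukushima.archive-disasters.jp/infolib/oai_repository/repository?verb=ListRecords&metadataPrefix=ndlkn

Instead of opening the site in my browser, this opens a Notepad file with the following contents: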

    <?xml version="1.0" encoding="UTF-8" ?>
    <OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
    <responseDate>2013-12-30T00:11:45Z</responseDate>
    <request>http://fukushima.archive-disasters.jp/infolib/oai_repository/repository</request>
    <error code="badArgument">It is an inaccurate parameter.</error>

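The only place `metadataPrefix` appears in the XML is on line 3:

    <request metadataPrefix="ndlkn" verb="ListRecords">

Running the same URL through `scrapy shell` gives the typical Scrapy output, and then: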

    >>> sel.remove_namespaces()
    >>> sel.xpath('//record')
    []
    >>> sel.xpath('//OAI-PMH')
    [<Selector xpath='//OAI-PMH' data=u'<OAI-PMH xmlns="http://www.openarchives.'>]
    >>> sel.xpath('//OAI-PMH/request')
    [<Selector xpath='//OAI-PMH/request' data=u'<request xmlns="http://www.openarchives.'>]
    >>> sel.xpath('//OAI-PMH/ListRecords')
    []
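
An empty result for `//record` can simply mean the server returned an OAI-PMH error instead of records. The following is a minimal sketch of how one might check for that using only the standard library rather than Scrapy's selectors. The sample document is the response shown above with a closing `</OAI-PMH>` tag added so that it parses, and the helper names `oai_errors` and `oai_records` are hypothetical.

```python
import xml.etree.ElementTree as ET

# The OAI-PMH 2.0 namespace, as declared on the root element of the response.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Assumption: the response quoted in the question, closed so that it is well-formed.
SAMPLE_RESPONSE = """<?xml version="1.0" encoding="UTF-8" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2013-12-30T00:11:45Z</responseDate>
  <request>http://fukushima.archive-disasters.jp/infolib/oai_repository/repository</request>
  <error code="badArgument">It is an inaccurate parameter.</error>
</OAI-PMH>"""


def oai_errors(xml_text):
    """Return (code, message) pairs for every <error> child of the root."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return [
        (err.get("code"), (err.text or "").strip())
        for err in root.findall("oai:error", OAI_NS)
    ]


def oai_records(xml_text):
    """Return all <record> elements, using a namespace-qualified path."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return root.findall(".//oai:record", OAI_NS)


print(oai_errors(SAMPLE_RESPONSE))        # [('badArgument', 'It is an inaccurate parameter.')]
print(len(oai_records(SAMPLE_RESPONSE)))  # 0
```

Here the namespace is passed explicitly instead of being stripped, which is an alternative to calling `sel.remove_namespaces()` first. Either way, no record path can match while the response carries a `badArgument` error and no `ListRecords` element.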


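What XPath do I need to use?

Sorry for the long question. I'm just worried that the two problems are related and that OAI-PMH is causing trouble here. Please let me know if I should split this up, or if there is any other way I can make it clearer.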

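Edit: I feel super dumb, but I realized what the problem was. Because the URL contains an `&`, it needs to be quoted or escaped when it is passed to `scrapy view` or `scrapy shell`. This solved both of my problems! Hopefully this helps anyone in the future.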

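I realized my mistake. Because the URL has an `&` in it, it needs to be quoted or escaped after `scrapy view` or `scrapy shell`. This solved both of my problems! Hopefully this helps anyone in the future.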
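
To illustrate why the unquoted URL fails, here is a small sketch assuming the shell treats `&` as a command separator, so only the part of the URL before it reaches Scrapy. The helper name `query_params` is hypothetical. Note that `shlex.quote` produces quoting for POSIX shells, while on Windows `cmd` the usual approach is to wrap the URL in double quotes.

```python
import shlex
from urllib.parse import parse_qs, urlsplit

URL = (
    "http://fukushima.archive-disasters.jp/infolib/oai_repository/repository"
    "?verb=ListRecords&metadataPrefix=ndlkn"
)


def query_params(url):
    """Return the query-string parameters a server would receive for this URL."""
    return parse_qs(urlsplit(url).query)


# Assumption: an unquoted "&" ends the command, so the URL is cut off there.
truncated = URL.split("&", 1)[0]

print(query_params(truncated))  # {'verb': ['ListRecords']}
print(query_params(URL))        # {'verb': ['ListRecords'], 'metadataPrefix': ['ndlkn']}

# Build a command line that is safe to paste into a POSIX shell.
print("scrapy shell " + shlex.quote(URL))
```

With the truncated URL the `metadataPrefix` argument never reaches the repository, which is consistent with the `badArgument` error in the question.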

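You can post your own answer and accept it, so that the question shows as solved.

Yes, I tried that last night, but because my reputation is below 10 I had to wait 8 hours or so. So the answer is posted now, but I can't accept it until tomorrow, haha.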
