Python: parsing an XML attribute with nested namespaces using lxml
I want to parse the country Code attributes inside RankByCountry. How can I do that? The goal is to print the list ['GB','US','O'].
<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11"><aws:OperationRequest><aws:RequestId>122bfdc6-ae8e-d2a2-580e-3841ab33b966</aws:RequestId></aws:OperationRequest><aws:UrlInfoResult><aws:Alexa>
<aws:TrafficData>
<aws:DataUrl type="canonical">androidjones.com/</aws:DataUrl>
<aws:RankByCountry>
<aws:Country Code="GB">
<aws:Rank>80725</aws:Rank>
<aws:Contribution>
<aws:PageViews>30.6%</aws:PageViews>
<aws:Users>41.3%</aws:Users>
</aws:Contribution>
</aws:Country>
<aws:Country Code="US">
<aws:Rank>354356</aws:Rank>
<aws:Contribution>
<aws:PageViews>39.1%</aws:PageViews>
<aws:Users>28.9%</aws:Users>
</aws:Contribution>
</aws:Country>
<aws:Country Code="O">
<aws:Rank/>
<aws:Contribution>
<aws:PageViews>30.2%</aws:PageViews>
<aws:Users>29.8%</aws:Users>
</aws:Contribution>
</aws:Country>
</aws:RankByCountry>
</aws:TrafficData>
</aws:Alexa></aws:UrlInfoResult><aws:ResponseStatus xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/"><aws:StatusCode>Success</aws:StatusCode></aws:ResponseStatus></aws:Response></aws:UrlInfoResponse>
But no luck.
I also tried:
# '//Country' carries no namespace, so it matches none of the aws:Country elements
for country in tree.xpath('//Country'):
    for attrib in country.attrib:
        print('@' + attrib + '=' + country.attrib[attrib])
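The document looks odd because it declares the aws namespace prefix twice. You need to use the more specific namespace (the one declared on aws:Response), because that declaration overrides the outer one for the aws prefix. In fact, you did that part right.
The problem is the XPath expression itself. It should look like this: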
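# map the aws prefix to the inner namespace, the one in scope for aws:Country
namespaces = {'aws': 'http://awis.amazonaws.com/doc/2005-07-11'}
for country in tree.xpath('//aws:RankByCountry/aws:Country/@Code', namespaces=namespaces):
    print(country)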
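Note that <aws:RankByCountry> has no Code attribute, but <aws:Country> does.
Have you tried anything?
Yes, but I don't know how to access the attributes, only the data... @KhalilAmmour-للمو- I edited the question.
My bad, I thought you meant using regex!!
Thanks @hek2mgl. You can change root to tree to stay consistent with my code.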
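For anyone who wants to check the answer end to end, here is a minimal self-contained sketch. It assumes a trimmed-down copy of the response from the question (only the namespace declarations and the Country elements are kept), and the variable names XML, codes, outer_uri, inner_uri and no_ns are illustrative, not from the original post. It also shows that the same aws prefix resolves to two different URIs depending on depth, which is why the XPath mapping has to use the inner one.

```python
from lxml import etree

# Trimmed-down copy of the response from the question (assumption: the real
# document nests its namespace declarations the same way).
XML = b"""<aws:UrlInfoResponse xmlns:aws="http://alexa.amazonaws.com/doc/2005-10-05/">
<aws:Response xmlns:aws="http://awis.amazonaws.com/doc/2005-07-11">
<aws:RankByCountry>
<aws:Country Code="GB"><aws:Rank>80725</aws:Rank></aws:Country>
<aws:Country Code="US"><aws:Rank>354356</aws:Rank></aws:Country>
<aws:Country Code="O"><aws:Rank/></aws:Country>
</aws:RankByCountry>
</aws:Response>
</aws:UrlInfoResponse>"""

tree = etree.fromstring(XML)

# The prefix in the XPath is resolved through this dict, not through the
# document, so it must map to the URI that is in scope for aws:Country.
namespaces = {"aws": "http://awis.amazonaws.com/doc/2005-07-11"}

# Selecting @Code returns the attribute values as a list of strings.
codes = tree.xpath("//aws:RankByCountry/aws:Country/@Code", namespaces=namespaces)
print(codes)

# The same prefix is bound to different URIs on the root and on its child.
outer_uri = tree.nsmap["aws"]      # declared on aws:UrlInfoResponse
inner_uri = tree[0].nsmap["aws"]   # redeclared on aws:Response

# Without a namespace, the attempt from the question matches nothing.
no_ns = tree.xpath("//Country")
```

Mapping aws to the outer URI instead would leave codes empty, because no Country element lives in that namespace.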