Python: accessing nested children in an XML file parsed with ElementTree


I'm new to XML parsing. I have the following tree:

FHRSEstablishment
 |--> Header
 |    |--> ...
 |--> EstablishmentCollection
 |    |--> EstablishmentDetail
 |    |    |-->...
 |    |--> Scores
 |    |    |-->...
 |--> EstablishmentCollection
 |    |--> EstablishmentDetail
 |    |    |-->...
 |    |--> Scores
 |    |    |-->...

But when I access it with ElementTree and look up the child tags and attributes:
import xml.etree.ElementTree as ET
import urllib2
tree = ET.parse(
   urllib2.urlopen('http://ratings.food.gov.uk/OpenDataFiles/FHRS408en-GB.xml'))
root = tree.getroot()
for child in root:
   print child.tag, child.attrib

I only get:
Header {}
EstablishmentCollection {}

I assume this means their attributes are empty. Why is that, and how can I access the children nested inside EstablishmentDetail and Scores?
Edit
Thanks to the answer below I can now get into the tree, but retrieving the values inside Scores still fails:
for node in root.find('.//EstablishmentDetail/Scores'):
    rating = node.attrib.get('Hygiene')
    print rating 

This produces:
None
None
None

Why is that?
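You have to iterate over your root recursively. Looping over root directly only visits its immediate children, whereas root.iter() walks every element in the tree, so that will do it: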
import xml.etree.ElementTree as ET
import urllib2
tree = ET.parse(urllib2.urlopen('http://ratings.food.gov.uk/OpenDataFiles/FHRS408en-GB.xml'))
root = tree.getroot()
for child in root.iter():
   print child.tag, child.attrib

Output:
FHRSEstablishment {}
Header {}
ExtractDate {}
ItemCount {}
ReturnCode {}
EstablishmentCollection {}
EstablishmentDetail {}
FHRSID {}
LocalAuthorityBusinessID {}
...

FHRSID {}
LocalAuthorityBusinessID {}
BusinessName {}
BusinessType {}
BusinessTypeID {}
RatingValue {}
RatingKey {}
RatingDate {}
LocalAuthorityCode {}
LocalAuthorityName {}
LocalAuthorityWebSite {}
LocalAuthorityEmailAddress {}
Scores {}
SchemeType {}
NewRatingPending {}
Geocode {}
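The difference between looping over root and calling root.iter() can be sketched in Python 3 as follows. The XML string is a small hypothetical stand-in for the real feed (only the tag names come from the thread), so treat it as an illustration rather than the actual data. It also shows why every attrib dictionary is empty: in this layout the values are stored as element text, not as attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of the feed; the real file is much larger.
SAMPLE = """<FHRSEstablishment>
  <Header>
    <ItemCount>1</ItemCount>
  </Header>
  <EstablishmentCollection>
    <EstablishmentDetail>
      <BusinessName>Cafe A</BusinessName>
      <Scores>
        <Hygiene>5</Hygiene>
      </Scores>
    </EstablishmentDetail>
  </EstablishmentCollection>
</FHRSEstablishment>"""

root = ET.fromstring(SAMPLE)

# Iterating an element yields only its direct children.
direct_children = [child.tag for child in root]

# iter() walks the element itself and all of its descendants, in document order.
all_tags = [elem.tag for elem in root.iter()]

# The value lives in .text; .attrib is empty because the tag has no XML attributes.
hygiene = root.find('.//Hygiene')
hygiene_attributes = hygiene.attrib
hygiene_text = hygiene.text

print(direct_children)
print(all_tags)
print(hygiene_attributes, hygiene_text)
```

An element would only have a non-empty attrib if the XML were written like &lt;Hygiene value="5"/&gt;, which is not how this feed appears to be structured.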


To get all the tags inside EstablishmentDetail, you need to find that tag and then loop through its children.

For example:
for child in root.find('.//EstablishmentDetail'):
    print child.tag, child.attrib

Output:
FHRSID {}
LocalAuthorityBusinessID {}
BusinessName {}
BusinessType {}
BusinessTypeID {}
RatingValue {}
RatingKey {}
RatingDate {}
LocalAuthorityCode {}
LocalAuthorityName {}
LocalAuthorityWebSite {}
LocalAuthorityEmailAddress {}
Scores {}
SchemeType {}
NewRatingPending {}
Geocode {}
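Note that find() returns only the first match. A possible Python 3 sketch for visiting every EstablishmentDetail uses findall() and collects each child's tag and text into a dictionary. The sample XML and the records name are made up for illustration and are not part of the original answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample with two establishments; field values are invented.
SAMPLE = """<FHRSEstablishment>
  <EstablishmentCollection>
    <EstablishmentDetail>
      <FHRSID>1</FHRSID>
      <BusinessName>Cafe A</BusinessName>
    </EstablishmentDetail>
    <EstablishmentDetail>
      <FHRSID>2</FHRSID>
      <BusinessName>Cafe B</BusinessName>
    </EstablishmentDetail>
  </EstablishmentCollection>
</FHRSEstablishment>"""

root = ET.fromstring(SAMPLE)

records = []
# findall() returns every matching element, not just the first one.
for detail in root.findall('.//EstablishmentDetail'):
    # Map each direct child's tag to its text; .text may be None for empty tags.
    record = {child.tag: (child.text or '').strip() for child in detail}
    records.append(record)

print(records)
```

Children that themselves contain elements (such as Scores) would end up with an empty string here, so they need a second lookup of their own.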


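To get the Hygiene score you mentioned in the comments:

What you did, root.find('.//Scores'), returns only the first Scores tag. When you then run for child in root.find('.//Scores'): rating = child.get('Hygiene'), the loop visits its child tags Hygiene, ConfidenceInManagement and Structural, and get('Hygiene') looks for an attribute named Hygiene on each of them. None of the three children has such an attribute, so you get None three times.
You need to first:
- find all the Scores tags;
- look up Hygiene inside each tag that was found.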

for each in root.findall('.//Scores'):
    rating = each.find('.//Hygiene')
    print '' if rating is None else rating.text

Output:
5
5
5
0
5
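The same two steps can be written a little more compactly in Python 3 with findtext(), which returns the text of the first matching child or a default when there is none. This is only a sketch against a hypothetical sample (the score values are invented), and it contrasts the attribute lookup from the question with the text lookup.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the second establishment has no scores at all.
SAMPLE = """<FHRSEstablishment>
  <EstablishmentCollection>
    <EstablishmentDetail>
      <Scores>
        <Hygiene>5</Hygiene>
        <Structural>10</Structural>
        <ConfidenceInManagement>0</ConfidenceInManagement>
      </Scores>
    </EstablishmentDetail>
    <EstablishmentDetail>
      <Scores/>
    </EstablishmentDetail>
  </EstablishmentCollection>
</FHRSEstablishment>"""

root = ET.fromstring(SAMPLE)

# The attempt from the question: iterate the children of the FIRST Scores tag
# and ask each child for an attribute called 'Hygiene' (none of them has one).
attribute_lookup = [
    node.attrib.get('Hygiene')
    for node in root.find('.//EstablishmentDetail/Scores')
]

# The fix: visit every Scores tag and read the text of its Hygiene child.
hygiene_values = [
    scores.findtext('Hygiene', default='')
    for scores in root.findall('.//Scores')
]

print(attribute_lookup)
print(hygiene_values)
```

Using default='' mirrors the answer's choice of printing an empty string when a Scores tag has no Hygiene child.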

Hope this is useful:
import xml.etree.ElementTree as etree
with open('filename.xml') as tmpfile:
    doc = etree.iterparse(tmpfile, events=("start", "end"))
    doc = iter(doc)
    event, root = doc.next()
    num = 0
    for event, elem in doc:
        print event, elem

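Wow, that's great, but I'm still struggling to get the final values, such as the scores. If I do for child in root.find('.//Scores'): rating = child.get('Hygiene'); print rating, the result is None. What should I do?
Is this a regular expression?
event, root = doc.next()
AttributeError: 'IterParseIterator' object has no attribute 'next'
My script works on Python 2; for Python 3 use: event, root = doc.__next__()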
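For Python 3, the built-in next() is the usual way to advance an iterator and is equivalent to calling __next__() directly. Below is a minimal sketch of the iterparse answer ported to Python 3; it reads from an in-memory byte string instead of 'filename.xml', and that sample content is an assumption made so the snippet is self-contained.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical in-memory document standing in for 'filename.xml'.
SAMPLE = b"<FHRSEstablishment><Header><ItemCount>1</ItemCount></Header></FHRSEstablishment>"

source = io.BytesIO(SAMPLE)

# iterparse() yields (event, element) pairs while the document is being read.
context = iter(ET.iterparse(source, events=("start", "end")))

# next(context) replaces the Python 2 spelling doc.next();
# the first event is the start of the root element.
first_event, root = next(context)

seen = [(first_event, root.tag)]
for event, elem in context:
    seen.append((event, elem.tag))

print(seen)
```

With a real file, passing the filename (or a file opened in binary mode) to iterparse() should work the same way.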