Parsing XML data with nested tags in Python

python xml

I am trying to use ElementTree to parse XML data in the format shown below:

<dataset>
    <title>Birds of Kafiristan</title>
    <creator>
        <individualName>
            <givenName>James</givenName>
            <surName>Brooke</surName>
        </individualName>
    </creator>
    <creator>
        <organizationName>Bird Conservation Alliance</organizationName>
        <address>
            <deliveryPoint>P.O. Box 999</deliveryPoint>
            <deliveryPoint>Mailstop 1234</deliveryPoint>
            <city>Washington</city>
            <administrativeArea>DC</administrativeArea>
            <postalCode>9999</postalCode>
            <country>USA</country>
        </address>
        <phone phonetype="voice">999-999-9999 x 123</phone>
        <phone phonetype="fax">999-999-9999</phone>
        <electronicMailAddress>contact@birds.org</electronicMailAddress>
        <onlineUrl>http://www.birds.org/</onlineUrl>
    </creator>
    <contact>
        <individualName>
            <givenName>Josiah</givenName>
            <surName>Harlan</surName>
        </individualName>
    </contact>
    <pubDate>2010</pubDate>
    <abstract>
         <para>This dataset contains the results of a bird survey from Kafiristan</para>
    </abstract>
    <keywordSet>
         <keyword>birds</keyword>
         <keyword>biodiversity</keyword>
         <keyword>animal ecology</keyword>
    </keywordSet>
    <distribution>
        <online>
           <url>http://birds.org/datasets</url>
        </online>
    </distribution>
</dataset>

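With my current code I get the values of some elements, but not all of them. In particular, I cannot get the values of nested elements (such as "givenName" and "surName", which are nested inside "individualName", which in turn is nested inside "creator").

Any hints?

As always, thanks in advance for any help you can provide.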

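It seems a defaultdict could be useful here:

import collections

# rootElement is assumed to be the parsed root Element of the document
d = collections.defaultdict(list)
for element in rootElement.iter():
    d[element.tag].append(element.text)

This gives you a mapping from each tag to a list of the "text" values associated with it (one item for every element with that tag in the XML). Because iter() walks the whole tree, nested elements such as "givenName" are included.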
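As a rough, self-contained sketch of the defaultdict approach, the snippet below runs it against an abbreviated copy of the XML from the question. The `xml_text` variable and the whitespace filtering are assumptions added for illustration; they are not part of the original answer.

```python
import collections
import xml.etree.ElementTree as ET

# Abbreviated copy of the XML from the question (assumption: inlined as a string).
xml_text = """
<dataset>
    <title>Birds of Kafiristan</title>
    <creator>
        <individualName>
            <givenName>James</givenName>
            <surName>Brooke</surName>
        </individualName>
    </creator>
    <contact>
        <individualName>
            <givenName>Josiah</givenName>
            <surName>Harlan</surName>
        </individualName>
    </contact>
    <keywordSet>
        <keyword>birds</keyword>
        <keyword>biodiversity</keyword>
    </keywordSet>
</dataset>
"""

rootElement = ET.fromstring(xml_text)

d = collections.defaultdict(list)
for element in rootElement.iter():  # visits every element, however deeply nested
    text = (element.text or "").strip()
    if text:  # skip container elements whose text is only whitespace
        d[element.tag].append(text)

print(d["givenName"])  # ['James', 'Josiah']
print(d["keyword"])    # ['birds', 'biodiversity']
```

Note that this flattens the document: the two "givenName" values end up in one list, so you can no longer tell that one belongs to a "creator" and the other to a "contact".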

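Do you want the text associated with every tag in the document, or are you only looking at specific tags? In the latter case, Element.find may be useful.

This is very useful! Why doesn't the ElementTree documentation mention this great trick with dictionaries?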
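For the specific nested tags in the question, Element.find accepts a path relative to the element it is called on. The sketch below is one possible way to use it on an abbreviated copy of the question's XML. Pairing it with `findall` and `findtext` (both standard ElementTree methods) is a choice made for this example rather than something the answer above spells out, and the `creators` list and its dictionary keys are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the XML from the question (assumption: inlined as a string).
xml_text = """
<dataset>
    <title>Birds of Kafiristan</title>
    <creator>
        <individualName>
            <givenName>James</givenName>
            <surName>Brooke</surName>
        </individualName>
    </creator>
    <creator>
        <organizationName>Bird Conservation Alliance</organizationName>
        <phone phonetype="voice">999-999-9999 x 123</phone>
        <phone phonetype="fax">999-999-9999</phone>
    </creator>
</dataset>
"""

root = ET.fromstring(xml_text)

# find() returns the first element matching the path, or None if nothing matches.
given = root.find("creator/individualName/givenName")
print(given.text)  # James

# findall() returns every matching child; findtext() returns the text of the
# first match, or None when the path does not exist under that element.
creators = []
for creator in root.findall("creator"):
    creators.append({
        "given": creator.findtext("individualName/givenName"),
        "sur": creator.findtext("individualName/surName"),
        "org": creator.findtext("organizationName"),
    })
print(creators)

# An attribute predicate selects one of several sibling elements with the same tag.
fax = root.find("creator/phone[@phonetype='fax']")
print(fax.text)  # 999-999-9999
```

Unlike the flat dictionary, this keeps each value tied to the "creator" it came from, at the cost of having to know the paths in advance.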
