Python: merging XML files where the attribute ID is the same


python xml python-2.7 merge


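I have two XML files that I want to merge (XML1, XML2 and the output I want are listed at the end of this post).

Alternatively, if the tags are swapped and I want the same output but under a single root node, i.e. diseaseAttributes, how can I achieve this?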

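Your first XML file is missing the closing </attribute> tag under <children>. Both files are also very badly structured: absurdly verbose, with confusing names, so I really can't tell what you are trying to do.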
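
A quick way to confirm a complaint like this is to hand the posted text to a parser. The following is a minimal sketch using the standard-library ElementTree; the helper name is_well_formed and the repaired variant (closing the inner attribute element just before </children>) are assumptions made for illustration, not something the asker posted.

```python
from xml.etree import ElementTree

# XML1 exactly as posted: the inner <attribute> is never closed.
XML1_AS_POSTED = """
<hierachyAttributes>
    <attribute>
        <displayOrder>2</displayOrder>
        <attributeID>Demographics</attributeID>
        <children>
            <attribute>
                <displayOrder>1</displayOrder>
                <attributeID>age</attributeID>
        </children>
    </attribute>
</hierachyAttributes>
"""

# Assumed repair: close the inner <attribute> right before </children>.
XML1_REPAIRED = XML1_AS_POSTED.replace(
    "<attributeID>age</attributeID>\n",
    "<attributeID>age</attributeID>\n            </attribute>\n",
)


def is_well_formed(text):
    """Return (True, None) if text parses, else (False, parser message)."""
    try:
        ElementTree.fromstring(text)
        return True, None
    except ElementTree.ParseError as err:
        return False, str(err)


if __name__ == "__main__":
    print(is_well_formed(XML1_AS_POSTED))
    print(is_well_formed(XML1_REPAIRED))
```

The parser message includes a line and column, which is usually enough to locate the unbalanced tag before attempting any merge.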

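The first file looks as if it simply expresses a tree of "attribute" relationships. It is the second one I don't understand: it seems to contain the definition and name of an attribute "Age" and what type of data it holds, yet it sits underneath "Cancer". Why? My guess is that you will display results broken down by age, but why is age tied to cancer? What happens if you have age data for something else, say winter flu deaths? Does that get its own separate age attribute?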

In fact, my first question: is XML2 supposed to work like this?


Cancer
Age
Age (years)
Double
10
0
10
10 - 20
10
20
Influenza
Age
Age (years)
Double
Child
0
18
Adult
18
60
Elderly
60

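Because that seems to be what you are implying, and even in my slightly smaller version it is horrible. I am also not sure how the hierarchy information fits into it.

Are the attributes and their hierarchy only used for displaying the data? Even so, this seems better: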


Demographics
Epidemiology
Age
Years
Clinical
0
Gender
Male
Female

and then:


Cancer
Age at death
True
Up to 10
Teens & young adults
Adults
Elderly
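
I think the best thing you can do is install the lxml module:

pip install lxml

and use it for any XML-related code, because it is much better than the built-in modules. Have a look at the tutorial: there are many ways to load, parse and process the attribute elements of each file in a single pass.
There is more useful information on the lxml website.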
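
As a rough illustration of the "single pass" idea, the sketch below walks a tree once and indexes every attribute element by the text of its attributeID child. It imports lxml when available and otherwise falls back to the standard-library ElementTree, since both offer fromstring, iter and findtext. The function name index_by_attribute_id and the sample document (XML1 with the missing closing tag restored) are assumptions for the example.

```python
try:
    from lxml import etree  # third-party; installed with `pip install lxml`
except ImportError:
    import xml.etree.ElementTree as etree  # standard-library fallback


def index_by_attribute_id(root):
    """Map each attributeID text to its <attribute> element in one pass."""
    index = {}
    for elem in root.iter("attribute"):
        key = elem.findtext("attributeID")
        if key is not None:
            index[key.strip()] = elem
    return index


# Assumed input: XML1 with the inner <attribute> properly closed.
SAMPLE = """
<hierachyAttributes>
    <attribute>
        <displayOrder>2</displayOrder>
        <attributeID>Demographics</attributeID>
        <children>
            <attribute>
                <displayOrder>1</displayOrder>
                <attributeID>age</attributeID>
            </attribute>
        </children>
    </attribute>
</hierachyAttributes>
"""

if __name__ == "__main__":
    idx = index_by_attribute_id(etree.fromstring(SAMPLE))
    print(sorted(idx))
```

An index like this avoids hard-coding names such as "age": any ID present in both files can be looked up the same way.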
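I only posted part of the XML files, so please ignore any mismatched tags. In the end I want to merge the two files wherever the attribute is the same, and in general I don't want to hard-code it for "age" or "Cancer". I want it to work for every attribute and every root.

How is that a merge? They have only one thing in common, somewhere in both files, but you have not even said which XML document is being merged into and which is being merged in, and you have simply put two root elements in the output, which only makes it more wrong and more confusing. The process is not clear, and having both in your "output" only adds to the confusion.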

XML1:

<hierachyAttributes>
    <attribute>
        <displayOrder>2</displayOrder>
        <attributeID>Demographics</attributeID>
        <children>
            <attribute>
                <displayOrder>1</displayOrder>
                <attributeID>age</attributeID>
        </children>
    </attribute>
</hierachyAttributes>

XML2:

<diseaseAttributes>
    <diseaseName>Cancer</diseaseName>
    <diseaseID>1322843</diseaseID>
    <metaAttributes>
        <attribute>
            <description>Age</description>
            <displayName>Age (years)</displayName>
            <attributeID>age</attributeID>
            <type>Double</type>
            <attributeCategory>Clinical</attributeCategory>
            <displayInSummary>TRUE</displayInSummary>
                <group>
                    <displayOrder>1</displayOrder>
                    <displayName>0 - &lt; 10</displayName>
                    <minValue>0</minValue>
                    <minInclusive>TRUE</minInclusive>
                    <maxValue>10</maxValue>
                    <maxInclusive>FALSE</maxInclusive>
                </group>
            </valueGroups>
        </attribute>
    </metaAttributes>
</diseaseAttributes>

Desired output:

<hierachyAttributes>
<diseaseAttributes>
    <diseaseName>Cancer</diseaseName>
    <diseaseID>1322843</diseaseID>
    <metaAttributes>
        <attribute>
        <displayOrder>2</displayOrder>
        <attributeID>Demographics</attributeID>
        <children>
            <attribute>
                <displayOrder>1</displayOrder>
                <attributeID>age</attributeID>
                <description>Age</description>
                <displayName>Age (years)</displayName>
                <type>Double</type>
                <attributeCategory>Clinical</attributeCategory>
                <displayInSummary>TRUE</displayInSummary>
                    <group>
                        <displayOrder>1</displayOrder>
                        <displayName>0 - &lt; 10</displayName>
                        <minValue>0</minValue>
                        <minInclusive>TRUE</minInclusive>
                        <maxValue>10</maxValue>
                        <maxInclusive>FALSE</maxInclusive>
                    </group>
                </valueGroups>
            </attribute>
        </children>
    </metaAttributes>
</diseaseAttributes>
</hierachyAttributes>

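#!/usr/bin/env python
import sys
from xml.etree import ElementTree

def run(files):
    # Append the children of every later root to the first file's root.
    # Note: this only concatenates; it does not match elements by attributeID.
    first = None
    for filename in files:
        data = ElementTree.parse(filename).getroot()
        if first is None:
            first = data
        else:
            first.extend(data)
    if first is not None:
        # print() with one argument behaves the same on Python 2.7 and 3
        print(ElementTree.tostring(first).decode("utf-8"))

if __name__ == "__main__":
    run(sys.argv[1:])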
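
The script above appends the second root's children to the first root, so nothing is matched on attributeID. Below is a minimal sketch of one way to get the layout shown in the desired output. It assumes both inputs are well-formed (the missing </attribute> and <valueGroups> tags are restored in the samples), that attributeID values are unique within the disease file, and that values already present in the hierarchy file win over the disease file. The name merge_by_attribute_id is hypothetical.

```python
import copy
from xml.etree import ElementTree as ET


def merge_by_attribute_id(hierarchy_root, disease_root):
    """Return a new tree: hierarchy attributes enriched with disease metadata."""
    # Index the disease file's <attribute> elements by attributeID text.
    meta = {}
    for attr in disease_root.iter("attribute"):
        key = attr.findtext("attributeID")
        if key is not None:
            meta[key.strip()] = attr

    # Work on copies so the parsed inputs are left untouched.
    merged = [copy.deepcopy(a) for a in hierarchy_root.findall("attribute")]
    for top in merged:
        # list() so that appending children does not disturb the iteration
        for attr in list(top.iter("attribute")):
            key = attr.findtext("attributeID")
            source = meta.get(key.strip()) if key else None
            if source is None:
                continue
            present = {child.tag for child in attr}
            for child in source:
                if child.tag not in present:  # hierarchy values take priority
                    attr.append(copy.deepcopy(child))

    # Assemble: hierarchy root > disease element > metaAttributes > attributes.
    root = ET.Element(hierarchy_root.tag)
    disease = ET.SubElement(root, disease_root.tag)
    for child in disease_root:
        if child.tag != "metaAttributes":
            disease.append(copy.deepcopy(child))
    ET.SubElement(disease, "metaAttributes").extend(merged)
    return root


# Sample inputs: the posted files with the assumed tag repairs applied.
XML1 = """
<hierachyAttributes>
    <attribute>
        <displayOrder>2</displayOrder>
        <attributeID>Demographics</attributeID>
        <children>
            <attribute>
                <displayOrder>1</displayOrder>
                <attributeID>age</attributeID>
            </attribute>
        </children>
    </attribute>
</hierachyAttributes>
"""

XML2 = """
<diseaseAttributes>
    <diseaseName>Cancer</diseaseName>
    <diseaseID>1322843</diseaseID>
    <metaAttributes>
        <attribute>
            <description>Age</description>
            <displayName>Age (years)</displayName>
            <attributeID>age</attributeID>
            <type>Double</type>
            <attributeCategory>Clinical</attributeCategory>
            <displayInSummary>TRUE</displayInSummary>
            <valueGroups>
                <group>
                    <displayOrder>1</displayOrder>
                    <displayName>0 - &lt; 10</displayName>
                    <minValue>0</minValue>
                    <minInclusive>TRUE</minInclusive>
                    <maxValue>10</maxValue>
                    <maxInclusive>FALSE</maxInclusive>
                </group>
            </valueGroups>
        </attribute>
    </metaAttributes>
</diseaseAttributes>
"""

if __name__ == "__main__":
    result = merge_by_attribute_id(ET.fromstring(XML1), ET.fromstring(XML2))
    print(ET.tostring(result).decode("utf-8"))
```

To run it on files instead of strings, pass ET.parse(path).getroot() for each input. For output under a single diseaseAttributes root, as the question also asks, returning the disease element instead of the wrapping root would be a small change to the last few lines.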