Flexible XML to dict with Python 3

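I've been given an XML file that contains the data I need, and I have to convert it to CSV.

This should be simple, but the "repeating units" in the XML do not always have the same number of child elements.

What I'm trying to work out is how best to iterate over the children of each child element until there are no more children, and return the result as one row. The final output should be a list of dictionaries (one dictionary per CSV row).

For example: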

            <repeatingunit>
                <city>
                    <name>London</name>
                </city>
                <station>
                    <name>Southwark</name>
                    <tubeline>
                        <name>Jubilee</name>
                    </tubeline>
                </station>
            </repeatingunit>
            <repeatingunit>
                <city>
                    <name>London</name>
                    <county>UK</county>
                </city>
                <station>
                    <name>Mile End</name>
                </station>
            </repeatingunit>

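I've been using xml.etree.ElementTree and root.iter, and I'm comfortable with loops, but here the depth of nesting is dynamic.

I tried applying the logic used for flattening multiply nested lists, but it didn't work. Can anyone point me in the right direction and suggest a different approach?

I know that dictionaries of different lengths are not ideal for writing to CSV, but I can deal with that depending on the output required.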
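On the last point, one possible way to write dictionaries with differing keys is `csv.DictWriter` with the union of all keys as the header. This is only a minimal sketch: the sample rows, the `|`-joined key names, and the helper name `write_rows` are assumptions made for illustration, not part of the original data or code.

```python
import csv
import io

# Illustrative rows with different key sets (assumed shape of the final output).
rows = [
    {"city|name": "London", "station|name": "Southwark",
     "station|tubeline|name": "Jubilee"},
    {"city|name": "London", "city|county": "UK", "station|name": "Mile End"},
]

def write_rows(rows, fileobj):
    """Write ragged dicts as CSV; hypothetical helper for illustration."""
    # Union of all keys, kept in first-seen order via dict.fromkeys.
    fieldnames = list(dict.fromkeys(key for row in rows for key in row))
    # restval fills in the cells for keys that a given row does not have.
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return fieldnames

# An in-memory buffer stands in for a real file opened with newline="".
buffer = io.StringIO()
header = write_rows(rows, buffer)
csv_text = buffer.getvalue()
```

Missing values come out as empty cells, so every CSV row has the same number of columns even though the dictionaries differ in length.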

How about a recursive solution? (The code below uses BeautifulSoup rather than ElementTree.)

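# Assumes soup = BeautifulSoup(xml_text, 'xml'), where xml_text (a placeholder
# name) holds the XML document as a string.

def build_key(elem, key, result):
    # Extend the key with this element's tag name, e.g. '|station|name'.
    key = key + '|' + elem.name
    # Direct child tags only; .children would also yield whitespace text
    # nodes and is an iterator, so `not elem.children` is never true.
    children = elem.find_all(True, recursive=False)
    if not children:
        # Leaf element: store its text under the full key.
        result[key] = elem.get_text(strip=True)
    else:
        for child in children:
            build_key(child, key, result)

results = []
for unit in soup.find_all('repeatingunit'):
    result = {}
    for child in unit.find_all(True, recursive=False):
        build_key(child, '', result)
    # Keep one dictionary per repeating unit.
    results.append(result)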
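Since the question mentions xml.etree.ElementTree, the same recursion can also be sketched with the standard library alone, using a generator and `yield from`. This is an illustrative variant, not the answerer's code: the `<root>` wrapper element, the sample string `XML_TEXT`, and the function names `iter_leaves` and `units_to_dicts` are assumptions, and the keys here omit the leading `|`.

```python
import xml.etree.ElementTree as ET

# Sample data mirroring the question, wrapped in an assumed <root> element.
XML_TEXT = """
<root>
  <repeatingunit>
    <city><name>London</name></city>
    <station>
      <name>Southwark</name>
      <tubeline><name>Jubilee</name></tubeline>
    </station>
  </repeatingunit>
  <repeatingunit>
    <city><name>London</name><county>UK</county></city>
    <station><name>Mile End</name></station>
  </repeatingunit>
</root>
"""

def iter_leaves(elem, path=""):
    """Yield (path, text) pairs for every leaf element at or below elem."""
    path = f"{path}|{elem.tag}" if path else elem.tag
    children = list(elem)  # iterating an Element yields its child elements
    if not children:
        yield path, (elem.text or "").strip()
    else:
        for child in children:
            # Delegate to the recursive generator for each subtree.
            yield from iter_leaves(child, path)

def units_to_dicts(root, unit_tag="repeatingunit"):
    """Build one flat dict per repeating unit."""
    rows = []
    for unit in root.iter(unit_tag):
        row = {}
        for child in unit:
            # dict.update accepts an iterable of (key, value) pairs.
            row.update(iter_leaves(child))
        rows.append(row)
    return rows

rows = units_to_dicts(ET.fromstring(XML_TEXT))
```

Note that if a unit contains two sibling elements with the same path (for example two `station` elements), the later value overwrites the earlier one in this sketch; such data would need a list value or an index in the key.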
