How can I find specific nodes in XML with Python while also checking their position in the tree?

python python-3.x xml

My XML looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<ServiceResponse xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="https://xx.xx.xx/xx/xx/x.x/xx/xx.xsd">
  <responseCode>SUCCESS</responseCode>
  <count>100</count>
  <hasMoreRecords>true</hasMoreRecords>
  <lastId>12345</lastId>
  <data>
    <Main>
      <sub1>1</sub1>
      <sub2>a</sub2>
    </Main>
    <Main>
      <sub1>2</sub1>
      <sub2>b</sub2>
    </Main>
  </data>
</ServiceResponse>

import csv
import xml.etree.ElementTree as etree
    
xml_file_name = 'blah.xml'
csv_file_name = 'blah.csv'
main_tag_name = 'Main'
fields = ['sub1', 'sub2']

tree = etree.parse(xml_file_name)

with open(csv_file_name, 'w', newline='', encoding="utf-8") as csv_file:
    csvwriter = csv.writer(csv_file)
    csvwriter.writerow(fields)
    for host in tree.iter(tag=main_tag_name):
        data = []
        for field in fields:
            if host.find(field) is not None:
                data.append(host.find(field).text)
            else:
                data.append('')
        csvwriter.writerow(data)

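Somehow I feel this is not the right way to parse the XML, because it searches for "Main" anywhere in the tree instead of following a specific path. In other words, if "Main" happens to appear anywhere else in the document, the program will not work as expected.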
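To illustrate the concern, here is a minimal sketch that compares `iter()` with a path-based `findall()`. The `SAMPLE` document and its stray `<Main>` under a `<meta>` element are invented for this illustration and are not part of the real data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: a stray <Main> sits outside <data>.
SAMPLE = """<ServiceResponse>
  <meta><Main><sub1>stray</sub1></Main></meta>
  <data>
    <Main><sub1>1</sub1><sub2>a</sub2></Main>
    <Main><sub1>2</sub1><sub2>b</sub2></Main>
  </data>
</ServiceResponse>"""

root = ET.fromstring(SAMPLE)

# iter() walks the whole tree, so it also returns the stray element.
anywhere = [m.findtext('sub1') for m in root.iter('Main')]

# A path relative to the root only matches <Main> directly under <data>.
under_data = [m.findtext('sub1') for m in root.findall('data/Main')]

print(anywhere)    # ['stray', '1', '2']
print(under_data)  # ['1', '2']
```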

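Could you suggest the most suitable approach you know of for this use case, preferably something built in that does not need much customization?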

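Note:

I want to use this as a generic script for multiple XML files, which have various tags before reaching the main tag and various child tags under it. This needs to be taken into account so that the tree structure is not hard-coded and stays configurable.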

You can try an XPath-based approach.

For example:

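import csv
import xml.etree.ElementTree as ET

# Parse the XML first, then write one CSV row per sub1/sub2 pair.
tree = ET.parse('test.xml')
root = tree.getroot()
# Only matches sub1/sub2 elements whose parent is Main and grandparent is data.
sub1_nodes = root.findall('.//data/Main/sub1')
sub2_nodes = root.findall('.//data/Main/sub2')

with open('some.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for a, b in zip(sub1_nodes, sub2_nodes):
        writer.writerow([a.text, b.text])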

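Thanks, I'll try this. However, if there are many tags such as sub1, sub2, and so on, this needs a separate root.findall() call for each one. Also, if the XML looks like (Main: sub1, sub2), (Main: sub1), (Main: sub2), then zip will pair the sub1 from the second Main with the sub2 from the third. So I'm looking for something that reaches each Main in turn and then collects whichever sub elements are available for that row.

OK. I just wanted to show an example of an XPath-based approach.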
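The per-row idea raised in the comments could be sketched roughly as follows. This assumes the path to the main element and the field names are passed in as configuration. The helper name `rows_from_xml`, its `main_path` parameter, and the `SAMPLE` document are hypothetical and used only for illustration. It relies on `Element.findall()` with a path and `Element.findtext()` with a default, both from the standard library.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_from_xml(root, main_path, fields):
    """Yield one list of field values per element matched by main_path."""
    for main in root.findall(main_path):
        # findtext() returns the default when the child element is missing,
        # so each row stays aligned with its own <Main>.
        yield [main.findtext(field, default='') for field in fields]


# Hypothetical sample where some <Main> elements lack a child.
SAMPLE = """<ServiceResponse>
  <data>
    <Main><sub1>1</sub1><sub2>a</sub2></Main>
    <Main><sub1>2</sub1></Main>
    <Main><sub2>c</sub2></Main>
  </data>
</ServiceResponse>"""

root = ET.fromstring(SAMPLE)
fields = ['sub1', 'sub2']
rows = list(rows_from_xml(root, 'data/Main', fields))

out = io.StringIO()  # in-memory stand-in for a real CSV file
writer = csv.writer(out)
writer.writerow(fields)
writer.writerows(rows)
print(out.getvalue())
```

For a real file, `ET.parse(xml_file_name).getroot()` would replace `ET.fromstring(SAMPLE)`. Because each file only needs its own `main_path` and `fields`, the tree structure stays in configuration rather than in code. Note that ElementTree supports only a limited subset of XPath.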