Python: find elements and print their values

python xml

I want to parse nested elements. I don't mind which library I use. For example, some of the values I want to print are located at:

>>> root[0][0][0][0][0].tag
'{http://www.domain.com/somepath/Schema}element'
>>> root[0][0][0][0][0].text
'findme'

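What is the ideal way to iterate over the XML document, parse it, and print the element values? Here is an example of the schema I am working with: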

<?xml version="1.0" encoding="UTF-8"?>
<data xsi:schemaLocation="http://www.domain.com/somepath/Schema file.xsd" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.domain.com/somepath/Schema">
    <one stuff0="" stuff1="">
        <two stuff0="" stuff1="">
            <three>
                <four stuff0="234234" stuff1="234324">
                    <element>findme</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme2</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme3</element>
                </four>
            </three>
        </two>  
    </one>
    <one stuff0="" stuff1="">
        <two stuff0="" stuff1="">
            <three>
                <four stuff0="234234" stuff1="234324">
                    <element>findme4</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme5</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme6</element>
                </four>
            </three>
        </two>  
    </one>
</data>

Given this, I have also tried the following, without success:

>>> for elem in doc.findall('one/two/three/four'):
...     print value.get('stuff1'), elem.text
...
>>>

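Problem found:

After some reading I learned that the elements were not being matched because the namespace was not specified. With the namespace included in the tag name, the following example works: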

>>> import xml.etree.cElementTree as ET
>>> for event, element in ET.iterparse("schema.xml"):
...     if element.tag == "{http://www.domain.com/somepath/Schema}element":
...         print(element.text)
...
findme
findme2
findme3
findme4
findme5
findme6
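
The failed findall attempt above can also be made to work by qualifying every step of the path with the namespace. The following is a minimal sketch, assuming Python 3 and the standard xml.etree.ElementTree module. The prefix "ns", the NS mapping, and the shortened inline document are illustrative choices, not part of the original file.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for schema.xml (assumption: same default namespace).
XML = """<data xmlns="http://www.domain.com/somepath/Schema">
    <one>
        <two>
            <three>
                <four stuff0="234234" stuff1="234324">
                    <element>findme</element>
                </four>
                <four stuff0="234234" stuff1="234324">
                    <element>findme2</element>
                </four>
            </three>
        </two>
    </one>
</data>"""

# Map an arbitrary prefix to the document's default namespace URI.
NS = {"ns": "http://www.domain.com/somepath/Schema"}

root = ET.fromstring(XML)

# root is the <data> element, so the path starts at its children.
results = []
for four in root.findall("ns:one/ns:two/ns:three/ns:four", NS):
    element = four.find("ns:element", NS)
    results.append((four.get("stuff1"), element.text))

for stuff1, text in results:
    print(stuff1, text)
```

Unlike the iterparse loop, this keeps access to the attributes of the enclosing four element alongside the text of its child.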

Without seeing your XML document I can't be sure, but I think this is what you want to do:

test.xml

<?xml version="1.0"?>
<root>
  <group>
    <element>This is the first text</element>
  </group>
  <group>
    <element>This is the second text</element>
  </group>
  <group>
    <element>This is the third text</element>
  </group>
</root>

Running this in a terminal, I get:

mike@tester:~$ python test.py
This is the first text
This is the second text
This is the third text

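Either of the libraries you linked to will work. I suggest using the cElementTree module. It is compiled C code, so it runs somewhat faster and uses less memory, but its interface is very similar to elementtree.

This is what I was looking for, although it printed no output. I have updated the question with an example of the schema.

My problem was the use of the namespace (updated in the question). How would your example handle namespaces?

To handle namespaces, use "{namespace uri}element" instead of "element". Alternatively, use ET.QName("namespace uri", "element").

That did it. The question is updated with your answer and the namespace usage. Thanks.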
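
The two ways of naming a namespaced tag mentioned above can be compared side by side. This is a small sketch, assuming Python 3 and xml.etree.ElementTree. The inline document is a cut-down, hypothetical stand-in for the asker's file, and the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

NAMESPACE = "http://www.domain.com/somepath/Schema"

# Cut-down stand-in document using the same default namespace.
XML = (
    '<data xmlns="http://www.domain.com/somepath/Schema">'
    "<four><element>findme</element></four>"
    "<four><element>findme2</element></four>"
    "</data>"
)

root = ET.fromstring(XML)

# Form 1: spell out the qualified tag as "{namespace uri}element".
braced_tag = "{%s}element" % NAMESPACE
by_string = [el.text for el in root.iter(braced_tag)]

# Form 2: build the same name with ET.QName; its .text attribute
# holds the braced string, which is compared against each tag.
qname = ET.QName(NAMESPACE, "element")
by_qname = [el.text for el in root.iter() if el.tag == qname.text]

print(by_string)
print(by_qname)
```

Both lists contain the same element texts, since QName only assembles the braced form of the name.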

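test.py

# cElementTree was removed in Python 3.9; on newer versions,
# import xml.etree.ElementTree as ET instead.
import xml.etree.cElementTree as ET

# Walk every element as it finishes parsing and print the text
# of those whose tag is "element".
for event, element in ET.iterparse("test.xml"):
    if element.tag == "element":
        print(element.text)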
