Getting element text with xml.dom.minidom in Python
Tags: python, xml, python-3.x, minidom
I am trying to get the text of an element using minidom. In the code below I also tried the suggested getText method, but I cannot get the desired output: I am unable to read the text value of the element I am processing. Here is my code:
import xml.dom.minidom

doc = xml.dom.minidom.parse("DL_INVOICE_DETAIL_TCB.xml")
results = doc.getElementsByTagName("G_TRANSACTIONS")

def getText(nodelist):
    # Concatenate the data of every text node in nodelist.
    rc = []
    for node in nodelist:
        if node.nodeType == node.TEXT_NODE:
            rc.append(node.data)
    return ''.join(rc)

# Note: getText is defined above but never called here.
for result in results:
    for element in result.getElementsByTagName("INVOICE_NUMBER"):
        print(element.nodeType)   # 1 (ELEMENT_NODE)
        print(element.nodeValue)  # None: an element node has no value of its own
Here is a sample of my XML:
<LIST_G_TRANSACTIONS>
  <G_TRANSACTIONS>
    <INVOICE_NUMBER>31002</INVOICE_NUMBER>
    <TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
  </G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>
If you can use ElementTree, here is the code:
import xml.etree.ElementTree as ET
xml = '''<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
</G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31006</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>'''
root = ET.fromstring(xml)
invoice_numbers = [entry.text for entry in root.findall('.//INVOICE_NUMBER')]
print(invoice_numbers)
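If each invoice number needs to stay paired with the other fields of its transaction, the same ElementTree approach can be applied one G_TRANSACTIONS element at a time. The following is only a sketch under that assumption; the names XML and records are illustrative and do not come from the question.

```python
import xml.etree.ElementTree as ET

# Sample data copied from the answer above (the name XML is illustrative).
XML = """<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
</G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31006</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""

root = ET.fromstring(XML)

# root is <LIST_G_TRANSACTIONS>, so findall("G_TRANSACTIONS") returns its
# direct children; findtext returns the text of the first matching child.
records = [
    (t.findtext("INVOICE_NUMBER"), t.findtext("TRANSACTION_CLASS"))
    for t in root.findall("G_TRANSACTIONS")
]
print(records)
```

findtext returns None when the child element is missing, so a transaction without an INVOICE_NUMBER would not raise here.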
A minidom-based answer:
from xml.dom import minidom
xml = """\
<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice1</TRANSACTION_CLASS>
</G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31006</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice2</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""
dom = minidom.parseString(xml)
invoice_numbers = [int(x.firstChild.data) for x in dom.getElementsByTagName("INVOICE_NUMBER")]
print(invoice_numbers)
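The reason the question's loop prints None is that in the DOM an element node has no value of its own: its text is held in child text nodes. The getText helper from the question works once it is given element.childNodes. Below is a minimal sketch of that idea, using the question's sample XML inline instead of the DL_INVOICE_DETAIL_TCB.xml file; the names XML, get_text and texts are illustrative.

```python
from xml.dom import minidom

# Inline copy of the question's sample XML (stands in for the real file).
XML = """<LIST_G_TRANSACTIONS>
<G_TRANSACTIONS>
<INVOICE_NUMBER>31002</INVOICE_NUMBER>
<TRANSACTION_CLASS>Invoice</TRANSACTION_CLASS>
</G_TRANSACTIONS>
</LIST_G_TRANSACTIONS>"""


def get_text(nodelist):
    """Join the data of all text nodes found directly in nodelist."""
    return "".join(
        node.data for node in nodelist if node.nodeType == node.TEXT_NODE
    )


dom = minidom.parseString(XML)
texts = []
for transaction in dom.getElementsByTagName("G_TRANSACTIONS"):
    for element in transaction.getElementsByTagName("INVOICE_NUMBER"):
        # element.nodeValue is None; the text lives in the child nodes.
        texts.append(get_text(element.childNodes))
print(texts)
```

Unlike x.firstChild.data, this helper returns an empty string for an empty element such as <INVOICE_NUMBER/>, where firstChild would be None.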
Thanks for the reply, but I need to do this with minidom.
FYI: ElementTree is part of the Python standard library, not an external package, just like minidom.
Output of the minidom code:
[31002, 31006]