How to search for a partial substring or regular expression in an XPath over an XML file using Python

python xml xpath

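I am trying to search the contents of an XML file with a regular expression pattern, and I cannot work out how to pass a substring that always ends in digits (the digits are the dynamic part of the XML file, so I do not know how to build the pattern and search with it).

Once the pattern is found, I need to get its child tag items, i.e. their attrib and text values.

XML file content: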

         <author NAME="PYTHON_DD101">
             <type>BOOK</type>
             <ID>59</ID>
             <inst ID="A">Garry</inst>
             <inst ID="B">Gerald</inst>
         </author>
         <author NAME="PYTHON_ABC4">
             <type>BOOK</type>
             <SrcID>62</SrcID>
             <inst ID="A">Niel</inst>
             <inst ID="B">Long</inst>
         </author>

Error received:

PYTHON_ABC
<_sre.SRE_Pattern object at 0x000000000403F168>
<class 'xml.etree.ElementTree.Element'>
<type 'NoneType'>
None
Traceback (most recent call last):
  File "C:\test_Book.py", line 45, in <module>
    bookauthor = book.get_Book_by_author(Book)
  File "C:\Book.py", line 219, in get_Book_by_author
    for FoundDetails in FoundDetails.iterfind('author'):
AttributeError: 'NoneType' object has no attribute 'iterfind'

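If I pass the exact NAME value, i.e. "PYTHON_ABC4", in the line below, it works. However, I do not want to hard-code the value, because the file may contain other entries whose names follow the same pattern, for example "PYTHON_ABC12", and in that case I want to get the details of those books as well.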

FoundDetails = Content.find(".//author[@NAME='{}']".format("PYTHON_ABC4"))
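For context, here is a minimal sketch of why the exact name works while a pattern does not. The `[@NAME='...']` predicate in `xml.etree.ElementTree` compares the attribute value literally, so regex-like text is treated as plain characters and `find()` returns `None`, which is consistent with the `AttributeError` in the traceback. The `library` wrapper element and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around a trimmed copy of the sample data.
xml_text = '''
<library>
    <author NAME="PYTHON_DD101"><type>BOOK</type></author>
    <author NAME="PYTHON_ABC4"><type>BOOK</type></author>
</library>
'''
content = ET.fromstring(xml_text)

# Exact attribute value: the predicate matches and an Element is returned.
exact = content.find(".//author[@NAME='{}']".format("PYTHON_ABC4"))

# Regex-like text: compared literally, so nothing matches and
# find() returns None instead of raising an error.
pattern_like = content.find(".//author[@NAME='{}']".format(r"PYTHON_ABC\d+"))

print(exact is not None)
print(pattern_like is None)

# Guard before calling methods on the result to avoid the AttributeError.
if pattern_like is not None:
    for child in pattern_like:
        print('{} {}'.format(child.tag, child.text))
```

Checking the result for `None` avoids the crash, but it does not make the pattern match; the matching itself has to be done in Python code on the attribute values.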

I modified your code a little to get the desired output; I hope it helps.

data='''
<PARAMETER-VALUES>
<author NAME="PYTHON_DD11">
             <type>BOOK</type>
             <ID>59</ID>
             <inst ID="A">Garry</inst>
             <inst ID="B">Gerald</inst>
         </author>
         <author NAME="PYTHON_ABC4">
             <type>BOOK</type>
             <SrcID>62</SrcID>
             <inst ID="A">Niel</inst>
             <inst ID="B">Long</inst>
         </author>
</PARAMETER-VALUES>
'''




# Parse the XML data with ElementTree

import xml.etree.ElementTree as ET
import re
root=ET.fromstring(data)

# Return True if the given string contains at least one digit

def hasnumbers(result):
    return bool(re.search(r'\d', result))

for author in root.iter('author'):
    # Default to '' so a missing NAME attribute does not break re.search
    result=author.attrib.get('NAME', '')
    if hasnumbers(result):
        for inst in author.iterfind('inst'):
            # format() gives the same output on Python 2 and Python 3
            print('inst id: {} inst name: {}'.format(inst.attrib.get('ID'), inst.text))
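Note that the `hasnumbers` check accepts any NAME containing a digit, so it also prints the entries for `PYTHON_DD11`. If only names made of a known prefix followed by digits are wanted (for example `PYTHON_ABC4` or `PYTHON_ABC12`), one possible sketch filters with an anchored regular expression. The helper names `find_authors_by_prefix` and `inst_details` are assumptions for illustration, and the `PYTHON_ABC12` entry with its `inst` value is made-up sample data.

```python
import re
import xml.etree.ElementTree as ET

# Sample data; the PYTHON_ABC12 entry is invented for this example.
sample = '''
<PARAMETER-VALUES>
    <author NAME="PYTHON_DD11">
        <inst ID="A">Garry</inst>
        <inst ID="B">Gerald</inst>
    </author>
    <author NAME="PYTHON_ABC4">
        <inst ID="A">Niel</inst>
        <inst ID="B">Long</inst>
    </author>
    <author NAME="PYTHON_ABC12">
        <inst ID="A">Ann</inst>
    </author>
</PARAMETER-VALUES>
'''

def find_authors_by_prefix(root, prefix):
    """Return author elements whose NAME is `prefix` followed only by digits."""
    # re.escape keeps any special characters in the prefix literal;
    # the trailing $ requires the digits to run to the end of the name.
    name_re = re.compile(re.escape(prefix) + r'\d+$')
    return [
        author for author in root.iter('author')
        if name_re.match(author.get('NAME', ''))
    ]

def inst_details(author):
    """Collect (ID attribute, text) pairs for the inst children of one author."""
    return [(inst.get('ID'), inst.text) for inst in author.iterfind('inst')]

root = ET.fromstring(sample)
for author in find_authors_by_prefix(root, 'PYTHON_ABC'):
    print('{} {}'.format(author.get('NAME'), inst_details(author)))
```

Filtering in Python rather than inside the XPath string is a deliberate choice here, since the limited XPath subset supported by `xml.etree.ElementTree` only compares attribute values for equality.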

Perfect, thanks Pankaj. Calling the "hasnumbers" helper function is a nice approach. I learned something new today. Thanks again.


Output of the answer's code:

inst id: A inst name: Garry
inst id: B inst name: Gerald
inst id: A inst name: Niel
inst id: B inst name: Long