Conditionally updating XML node values in Python

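I want to parse the XML below and update the value of "PolicyId" to some random value, e.g. "POL111112NGJ", and "TransactionDate" to the current date and time, but only when the condition PolicyId == POL000002NGJ is met. The code I give below updates every "PolicyId" and "TransactionDate" value; I only want to update them when the condition is true. For the given XML, I want the first 3 sets updated to the same "PolicyId" and "TransactionDate" values, and the fourth set to a different value.

I tried adding the condition if ROW.attrib['PolicyId'] == 'POL000002NGJ': but I get "KeyError: 'PolicyId'".

Can anyone help me figure out how to handle this?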

--BEFORE UPDATING---

<TABLE>
   <ROW>
      <PolicyId>POL000002NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL000002NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL111111NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
</TABLE>


--AFTER UPDATING---

<TABLE>
   <ROW>
      <PolicyId>POL545678NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>2020-03-27T10:56:15.00</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL545678NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3A2</BusinessCoverageCode>
      <TransactionDate>2020-03-27T10:56:15.00</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL111111NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
</TABLE>


The code I am using:

import xml.etree.ElementTree as ET
from datetime import datetime, timedelta
import random, string

class TimestampUpdater(object):

    def __init__(self, filepath):
        self.meta_file = filepath
        self.tree = ET.parse(r'C:\Users\\XML\python.xml')

    def getMetadataTree(self):
        return self.tree

    def getMetadataRoot(self):
        return self.tree.getroot()
    
    def updatepolicyid(self):
        for ROW in self.getMetadataRoot().findall('ROW'): ##    
            PolicyId = ROW.find('PolicyId')
            
            if ROW.attrib['PolicyId'] == 'POL702965NGJ':
                
                x = 'POL' + ''.join(random.choices(string.digits, k=6)) + 'NGJ'
                PolicyId.text = x
            #PolicyId.set('updated', 'yes')
            self.getMetadataTree().write(self.meta_file)
                   
    
    def updateLastModified(self):
            today = datetime.now()
            for ROW in self.getMetadataRoot().findall('ROW'): ##
                TransactionDate = ROW.find('TransactionDate')
                previous_update = datetime.strptime(TransactionDate.text, '%Y-%m-%dT%H:%M:%S.%f')
                if previous_update < today:
                    TransactionDate.text = today.strftime('%Y-%m-%dT%H:%M:%S.%f')
                    self.getMetadataTree().write(self.meta_file)
                    
                    

def print_file_content(filename):
    """Print contents of a file"""
    with open(filename, 'r') as fh:
        for line in fh:
            print(line.rstrip())

if __name__ == '__main__':
    metafile = 'output.xml'
    print("\n====Before updating:====")
    print_file_content(metafile)
    updater = TimestampUpdater(metafile)
    updater.updateLastModified()
    updater.updatepolicyid() 
    print("\n====After updating:====")
    print_file_content(metafile)

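Here is a relatively simple solution, but it uses lxml instead of xml.etree.ElementTree. For simplicity, I focus only on changing the node values; you will obviously have to adapt it to meet your other needs.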

from lxml import etree

policy = """[your xml above]"""  # if parsing from an XML string

# Pick ONE of the two ways to load the document:
doc = etree.ElementTree(etree.XML(policy.encode()))  # if parsing from an XML string
# doc = etree.parse(r'path\to\your\file\policy.xml')  # if parsing from a file
replacements = ["some random number", "some random date"]

# Select only the ROW elements whose PolicyId child has the target text
targets = doc.xpath('//ROW[PolicyId="POL000002NGJ"]')
for target in targets:
    target.xpath('./PolicyId')[0].text = replacements[0]
    target.xpath('.//TransactionDate')[0].text = replacements[1]
print(etree.tostring(doc).decode())
# or to save to a new file:
doc.write('new_policy.xml', pretty_print=True, xml_declaration=True, encoding="utf-8")

Output:

<TABLE>
   <ROW>
      <PolicyId>some random number</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>some random date</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>some random number</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>some random date</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>some random number</PolicyId>
      <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
      <TransactionDate>some random date</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL111111NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
</TABLE>
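
The placeholder strings in `replacements` could be swapped for real values. The following is a minimal sketch using only the standard library; the helper names `make_policy_id` and `make_timestamp` are hypothetical, and the ID shape ('POL' + 6 digits + 'NGJ') and the two-digit fractional seconds are assumptions read off the sample data in the question.

```python
import random
import string
from datetime import datetime


def make_policy_id(rng=random):
    """Return an ID shaped like 'POL' + 6 random digits + 'NGJ' (assumed format)."""
    return "POL" + "".join(rng.choices(string.digits, k=6)) + "NGJ"


def make_timestamp(now=None):
    """Format a datetime like the sample TransactionDate values."""
    now = now or datetime.now()
    # %f would emit six fractional digits; the sample data shows two,
    # so the hundredths of a second are appended by hand.
    return now.strftime("%Y-%m-%dT%H:%M:%S") + ".{:02d}".format(now.microsecond // 10000)


# Generate the values once so every matching row receives the same pair.
replacements = [make_policy_id(), make_timestamp()]
print(replacements)
```

Generating the pair once, before the loop, is what keeps all matching rows identical, which is what the question asks for.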

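The code below uses Python's built-in ElementTree XML library (no external library needed).

It uses an XPath expression to find the rows that have a specific PolicyId value.

It then updates their date to the current date.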

import xml.etree.ElementTree as ET

import datetime

xml = '''<TABLE>
   <ROW>
      <PolicyId>some random number</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>some random date</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>some random number</PolicyId>
      <BusinessCoverageCode>COV00002D3X1</BusinessCoverageCode>
      <TransactionDate>some random date</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>some random number</PolicyId>
      <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
      <TransactionDate>some random date</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL111111NGJ</PolicyId>
      <BusinessCoverageCode>COV00002D3X4</BusinessCoverageCode>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
</TABLE>
'''

root = ET.fromstring(xml)
# Select only ROW elements whose PolicyId child has the given text
rows_to_update = root.findall(".//ROW[PolicyId='POL111111NGJ']")
for row in rows_to_update:
    row.find('TransactionDate').text = str(datetime.datetime.now())
ET.dump(root)
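
The same ElementTree approach can be extended to change PolicyId as well as TransactionDate, which is what the question asks for. This is only a sketch: the function name `update_matching_rows` and the shortened `SAMPLE` document are made up for illustration, and the new ID is hard-coded rather than random.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SAMPLE = """<TABLE>
   <ROW>
      <PolicyId>POL000002NGJ</PolicyId>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
   <ROW>
      <PolicyId>POL111111NGJ</PolicyId>
      <TransactionDate>2020-03-23T10:56:15.00</TransactionDate>
   </ROW>
</TABLE>"""


def update_matching_rows(root, old_id, new_id, new_date):
    """Rewrite PolicyId and TransactionDate in every ROW whose PolicyId equals old_id."""
    # The [child='text'] predicate is part of ElementTree's limited XPath subset.
    rows = root.findall(f".//ROW[PolicyId='{old_id}']")
    for row in rows:
        row.find("PolicyId").text = new_id
        row.find("TransactionDate").text = new_date
    return len(rows)


root = ET.fromstring(SAMPLE)
changed = update_matching_rows(
    root, "POL000002NGJ", "POL545678NGJ", datetime.now().isoformat(timespec="seconds")
)
print(changed, "row(s) updated")
ET.dump(root)
```

To save the result instead of printing it, the root can be wrapped with `ET.ElementTree(root)` and written out with its `write()` method.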

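There are no attributes in your XML, which is why ROW.attrib['PolicyId'] raises a KeyError. PolicyId is a child element, so you should check its text instead: if PolicyId.text == 'POL702965NGJ':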
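
A small illustration of the difference between attributes and child-element text may help here. It is a sketch with a one-row document invented for the example, and the replacement ID is copied from the "after" sample in the question.

```python
import xml.etree.ElementTree as ET

row = ET.fromstring("<ROW><PolicyId>POL000002NGJ</PolicyId></ROW>")

# <ROW> carries no XML attributes, so attrib is an empty dict and
# row.attrib['PolicyId'] would raise KeyError.
print(row.attrib)                  # {}
print(row.attrib.get("PolicyId"))  # None

# PolicyId is a child element; its value lives in the element's text.
policy_id = row.find("PolicyId")
if policy_id is not None and policy_id.text == "POL000002NGJ":
    policy_id.text = "POL545678NGJ"

print(policy_id.text)              # POL545678NGJ
```

An attribute lookup would only apply if the data were written as `<ROW PolicyId="...">`, which is not how the XML in the question is structured.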