Python 3.x: How to find and update the contents of an XML file with Python
Tags: python-3.x, xml, beautifulsoup, elementtree
For example, given this XML:
<managedObject class="New" distName="MB-85404/TB-85404/ST-4/a" version="xL20A_1911_002" operation="open">
<p name="a">320ms</p>
<p name="b">enabled</p>
<p name="c">640ms</p>
<p name="d">320ms</p>
<p name="e">640ms</p>
<p name="f">1280ms</p>
<p name="g">6</p>
</managedObject>
<managedObject class="new" distName="AL-76867/MB-85404/TB-85404/ST-4/b" version="xL20A_1911_002" operation="open">
<p name="h">320ms</p>
<p name="i">enabled</p>
<p name="j">640ms</p>
<p name="k">320ms</p>
<p name="l">640ms</p>
<p name="a">1280ms</p>
<p name="l">6</p>
</managedObject>
<managedObject class="New" distName="MB-85404/TB-85404/ST-4/c" version="xL20A_1911_002" operation="open">
<p name="a">320ms</p>
<p name="p">enabled</p>
<p name="q">640ms</p>
<p name="r">320ms</p>
<p name="s">640ms</p>
<p name="t">1280ms</p>
<p name="u">6</p>
</managedObject>
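Edit 2
With this code I am trying to find the distName that matches the regex loc. Inside that managedObject I am trying to find the tag, and I want to update "4444 New York" to "3444 South texas". This gives me the error shown below: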
locat.string="3444 South texas"
AttributeError: 'NoneType' object has no attribute 'string'
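I hope I have understood your question correctly. This finds all <managedObject> tags with distName="MB-85404/TB-85404/ST-4/[a or b or c]", replaces 85404 with 85409 in distName, and updates the <p name="a"> tag: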
import re
from bs4 import BeautifulSoup
xml_data = ''' <managedObject class="New" distName="MB-85404/TB-85404/ST-4/a" version="xL20A_1911_002" operation="open">
<p name="a">320ms</p>
<p name="b">enabled</p>
<p name="c">640ms</p>
<p name="d">320ms</p>
<p name="e">640ms</p>
<p name="f">1280ms</p>
<p name="g">6</p>
</managedObject>
<managedObject class="new" distName="AL-76867/MB-85404/TB-85404/ST-4/b" version="xL20A_1911_002" operation="open">
<p name="h">320ms</p>
<p name="i">enabled</p>
<p name="j">640ms</p>
<p name="k">320ms</p>
<p name="l">640ms</p>
<p name="a">1280ms</p>
<p name="l">6</p>
</managedObject>
<managedObject class="New" distName="MB-85404/TB-85404/ST-4/c" version="xL20A_1911_002" operation="open">
<p name="a">320ms</p>
<p name="p">enabled</p>
<p name="q">640ms</p>
<p name="r">320ms</p>
<p name="s">640ms</p>
<p name="t">1280ms</p>
<p name="u">6</p>
</managedObject>'''
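# Wrap the fragment in a single root element so the XML parser accepts it.
soup = BeautifulSoup('<data>' + xml_data + '</data>', 'xml')
# Match distName values that start with MB-85404/TB-85404/ST-4/ followed by a, b or c.
r = re.compile(r'^MB-85404/TB-85404/ST-4/(?:a|b|c)')
for o in soup.find_all('managedObject', distName=r):
    o['distName'] = o['distName'].replace('85404', '85409')
    p = o.find('p', {'name': 'a'})
    p.string = 'UPDATED ' + p.string
# Remove the temporary <data> wrapper again.
soup.data.unwrap()
print(soup)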
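The AttributeError in the question is what happens when a lookup finds nothing: BeautifulSoup's find() returns None in that case, and so does ElementTree's. As a hedged sketch (not the answer's own method), here is the same update written with the standard-library xml.etree.ElementTree, with an explicit None check. The shortened sample data and the names pattern, updated and result are assumptions made for illustration only.

```python
import re
import xml.etree.ElementTree as ET

# Shortened, hypothetical sample: the second object has no <p name="a">.
xml_data = '''<managedObject class="New" distName="MB-85404/TB-85404/ST-4/a">
<p name="a">320ms</p>
<p name="b">enabled</p>
</managedObject>
<managedObject class="New" distName="MB-85404/TB-85404/ST-4/c">
<p name="p">enabled</p>
</managedObject>'''

# The fragment has several top-level elements, so wrap it in a single root.
root = ET.fromstring('<data>' + xml_data + '</data>')
pattern = re.compile(r'^MB-85404/TB-85404/ST-4/(?:a|b|c)')

updated = []
for obj in root.iter('managedObject'):
    dist_name = obj.get('distName', '')
    if not pattern.search(dist_name):
        continue
    obj.set('distName', dist_name.replace('85404', '85409'))
    p = obj.find("p[@name='a']")  # None when there is no such child
    if p is None:
        continue  # skip instead of raising AttributeError
    p.text = 'UPDATED ' + (p.text or '')
    updated.append(obj.get('distName'))

# Serialize the children only, which drops the temporary <data> wrapper.
result = ''.join(ET.tostring(child, encoding='unicode') for child in root)
print(result)
```

The same guard (if p is None) can be placed before p.string = ... in the BeautifulSoup version.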
EDIT2: To replace across the whole file, do the following:
with open("C:/files/abcd.xml", "r") as f_in:
    xml_data = f_in.read()
with open("C:/files/output.xml", "w") as f_out:
    f_out.write(xml_data.replace("85409","85904"))
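A plain text replacement like this does not parse the XML, so it changes every occurrence of the search string, in attribute values and in element text alike. The sketch below wraps the idea in a small helper to make that visible; replace_in_file, the temporary file names and the one-line sample document are hypothetical, chosen only for this illustration.

```python
from pathlib import Path
import tempfile


def replace_in_file(src, dst, old, new, encoding="utf-8"):
    """Copy src to dst with every occurrence of old replaced by new.

    Returns the number of occurrences that were replaced.
    """
    text = Path(src).read_text(encoding=encoding)
    count = text.count(old)
    Path(dst).write_text(text.replace(old, new), encoding=encoding)
    return count


# Demonstration in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "abcd.xml"
    dst = Path(tmp) / "output.xml"
    src.write_text(
        '<managedObject distName="MB-85404/TB-85404/ST-4/a">'
        '<p name="a">85404</p></managedObject>',
        encoding="utf-8",
    )
    replaced = replace_in_file(src, dst, "85404", "85409")
    output_text = dst.read_text(encoding="utf-8")

print(replaced)
print(output_text)
```

If only distName should change, the parser-based loop is the safer choice, because the text replacement also rewrites the value inside the p element here.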
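Thanks Andrej, I will try it. I just have one doubt: when you update the p tag, will it update the one inside the managedObject with distName="MB-85404/TB-85404/ST-4/[a or b or c]"? And if I want to change "85404" to "85409" across the entire XML file, regardless of where it appears, how do I do that? Sorry for asking so many questions, I am new to this.
That's fine, we can change it in distName, but I would like to know whether we can change it across the whole sheet, like a find-and-replace-all in Notepad?
@AkashRathor Then you can use str.replace: for example, read the XML into the variable xml_data, then run xml_data = xml_data.replace('85404', '85409') and save xml_data to a new file.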
Output of the BeautifulSoup script above:
<?xml version="1.0" encoding="utf-8"?>
<managedObject class="New" distName="MB-85409/TB-85409/ST-4/a" operation="open" version="xL20A_1911_002">
<p name="a">UPDATED 320ms</p>
<p name="b">enabled</p>
<p name="c">640ms</p>
<p name="d">320ms</p>
<p name="e">640ms</p>
<p name="f">1280ms</p>
<p name="g">6</p>
</managedObject>
<managedObject class="new" distName="AL-76867/MB-85404/TB-85404/ST-4/b" operation="open" version="xL20A_1911_002">
<p name="h">320ms</p>
<p name="i">enabled</p>
<p name="j">640ms</p>
<p name="k">320ms</p>
<p name="l">640ms</p>
<p name="a">1280ms</p>
<p name="l">6</p>
</managedObject>
<managedObject class="New" distName="MB-85409/TB-85409/ST-4/c" operation="open" version="xL20A_1911_002">
<p name="a">UPDATED 320ms</p>
<p name="p">enabled</p>
<p name="q">640ms</p>
<p name="r">320ms</p>
<p name="s">640ms</p>
<p name="t">1280ms</p>
<p name="u">6</p>
</managedObject>
To update distName on every managedObject that has the attribute, regardless of its current value:
for o in soup.find_all('managedObject', {'distName': True}):
    o['distName'] = o['distName'].replace('85404', '85409')