Remove specific elements that occur multiple times in an XML file using Python
Tags: python, lxml, elementtree

Hi everyone,
I would appreciate some guidance on adjusting an XML file using Python and the ElementTree library. Currently I have the following file.xml:
<component xmlns:xsi="http://www.w3.orgr">
<memoryMaps>
<memoryMap>
<name>name</name>
<description>description</description>
<peripheral>
<name>periph</name>
<description>description</description>
<baseAddress>0x0</baseAddress>
<range>0x8</range>
<width>32</width>
<registers>
<register>
<name>reg1</name>
<displayName>reg1</displayName>
<description>This is register 1</description>
<addressOffset>0x0</addressOffset>
<size>32</size>
<access>read-write</access>
<resetValue>0x00000002</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<fields>
.................
</fields>
<resetValue>0x00000002</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<description>This is register 1</description>
</register>
<register>
<name>reg2</name>
<displayName>reg2</displayName>
<description>This is register 2</description>
<addressOffset>0x0</addressOffset>
<size>32</size>
<access>read-write</access>
<resetValue>0x00000000</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<fields>
.................
</fields>
<resetValue>0x00000000</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<description>This is register 2</description>
</register>
<register>
..................
</register>
</registers>
</peripheral>
</memoryMap>
</memoryMaps>
</component>
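As you can see, each 'register' ends with 3 elements, 'resetValue', 'resetMask', and 'description', which also appear earlier inside the same 'register'. I want to delete the three elements that always sit at the end of the 'register' node. I can't simply remove elements by name, because I want to keep the earlier occurrences of 'resetValue', 'resetMask', and 'description'.
I would like to get a result like this: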
<registers>
<register>
<name>reg1</name>
<displayName>reg1</displayName>
<description>This is register 1</description>
<addressOffset>0x0</addressOffset>
<size>32</size>
<access>read-write</access>
<resetValue>0x00000002</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<fields>
.................
</fields>
</register>
<register>
<name>reg2</name>
<displayName>reg2</displayName>
<description>This is register 2</description>
<addressOffset>0x0</addressOffset>
<size>32</size>
<access>read-write</access>
<resetValue>0x00000000</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<fields>
.................
</fields>
</register>
<register>
..................
</register>
</registers>
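Is it possible to delete only the last occurrence of an element in the XML? Is there an index-based solution, given that the position is always the same (the last 3 elements at the end of each 'register') but the same elements also appear earlier?
Please let me know.
Thanks.

Here is a solution using the lxml XPath API: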
from lxml import etree as et
import itertools

# here the variable xml is a string with the source XML from your question
root = et.fromstring(xml)

# collect the elements to be deleted
# itertools.chain() joins the 3 lists returned by the xpath() calls into one iterable
# the predicate [last()] selects the last element of that name within each register
# note: [last()] also matches when a register contains only one element of that name
elements = itertools.chain(
    root.xpath("//register/resetValue[last()]"),
    root.xpath("//register/resetMask[last()]"),
    root.xpath("//register/description[last()]"))

# iterate over the selected elements and remove each one from its parent
for el in elements:
    el.getparent().remove(el)

# output result
print(et.tostring(root, pretty_print=True, xml_declaration=True).decode('utf8'))
Output:
<?xml version='1.0' encoding='ASCII'?>
<component xmlns:xsi="http://www.w3.orgr">
<memoryMaps>
<memoryMap>
<name>name</name>
<description>description</description>
<peripheral>
<name>periph</name>
<description>description</description>
<baseAddress>0x0</baseAddress>
<range>0x8</range>
<width>32</width>
<registers>
<register>
<name>reg1</name>
<displayName>reg1</displayName>
<description>This is register 1</description>
<addressOffset>0x0</addressOffset>
<size>32</size>
<access>read-write</access>
<resetValue>0x00000002</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<fields/>
</register>
<register>
<name>reg2</name>
<displayName>reg2</displayName>
<description>This is register 2</description>
<addressOffset>0x0</addressOffset>
<size>32</size>
<access>read-write</access>
<resetValue>0x00000000</resetValue>
<resetMask>0xFFFFFFFF</resetMask>
<fields/>
</register>
<register>
</register>
</registers>
</peripheral>
</memoryMap>
</memoryMaps>
</component>
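
The question also asks whether an index-based approach is possible with ElementTree. The following is a minimal sketch of that idea using the standard library's xml.etree.ElementTree instead of lxml. The helper name strip_trailing_duplicates, the TRAILING_TAGS set, and the shortened sample document are assumptions made for illustration; they are not part of the original question or answer. The sketch removes the last three children of each 'register' only when they are exactly 'resetValue', 'resetMask', and 'description', so a register without the trailing copies is left untouched.

```python
import xml.etree.ElementTree as ET

# Hypothetical constant: the tags expected as the last three children of a register
TRAILING_TAGS = {"resetValue", "resetMask", "description"}


def strip_trailing_duplicates(root):
    """Remove the last three children of each <register> by position."""
    # Materialize the iterator first so the tree is not modified while it is walked
    for register in list(root.iter("register")):
        tail = list(register)[-3:]
        # Guard: only act when the tail really is the three expected elements
        if len(tail) == 3 and {child.tag for child in tail} == TRAILING_TAGS:
            for child in tail:
                register.remove(child)
    return root


# Shortened stand-in for file.xml (an assumption, not the full original file)
sample = """<registers>
  <register>
    <name>reg1</name>
    <description>This is register 1</description>
    <resetValue>0x00000002</resetValue>
    <resetMask>0xFFFFFFFF</resetMask>
    <fields/>
    <resetValue>0x00000002</resetValue>
    <resetMask>0xFFFFFFFF</resetMask>
    <description>This is register 1</description>
  </register>
</registers>"""

root = strip_trailing_duplicates(ET.fromstring(sample))
remaining_tags = [child.tag for child in root.find("register")]
print(remaining_tags)
# → ['name', 'description', 'resetValue', 'resetMask', 'fields']
```

Because the check is positional, this variant depends on the trailing elements always being the last three children, as described in the question. For a real file, ET.parse("file.xml").getroot() could replace ET.fromstring(sample).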