Python: how to delete blocks from XML based on a condition
My XML file contains 10,000 users, and I need to delete every user whose email does not contain @acme.com.
<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
<user>
<id type="integer">14000760626</id>
<name> Credentialing Department</name>
<email>user1@acme.com</email>
<created-at type="dateTime">2020-03-26T10:23:34-04:00</created-at>
<updated-at type="dateTime">2020-03-26T10:23:34-04:00</updated-at>
<active type="boolean">false</active>
<job-title></job-title>
<phone>1234567890</phone>
<mobile>1234567890</mobile>
<description></description>
<time-zone>Eastern Time (US &amp; Canada)</time-zone>
<deleted type="boolean">false</deleted>
<language>en</language>
<address></address>
<external-id nil="true"/>
<helpdesk-agent type="boolean">false</helpdesk-agent>
<location-name nil="true"/>
<time-format>12h</time-format>
<company-names type="array"/>
<custom_field>
</custom_field>
</user>
</users>
I have also tried other approaches, but I always lose some data. Example result:
<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
<user>
<id>14000760626</id>
<name> Credentialing Department</name>
<email>test@aoncology.com</email>
<created-at>2020-03-26T10:23:34-04:00</created-at>
<updated-at>2020-03-26T10:23:34-04:00</updated-at>
<active>false</active>
<job-title>None</job-title>
<phone>1234567890</phone>
<mobile>1234567890</mobile>
<description>None</description>
<time-zone>Eastern Time (US &amp; Canada)</time-zone>
<deleted>false</deleted>
<language>en</language>
<address>None</address>
<external-id>None</external-id>
<helpdesk-agent>false</helpdesk-agent>
<location-name>None</location-name>
<time-format>12h</time-format>
<company-names>None</company-names>
<custom_field>
</custom_field>
</user>
</users>
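Compared with the input, the result above has lost its attributes (type="integer", nil="true") and its empty elements now contain the text None. The question does not show which approach produced this, so the cause is not known. One way that usually avoids this kind of loss is to remove the unwanted elements from the parsed tree in place instead of rebuilding the document from extracted values. The following is a minimal sketch of that idea using the standard library's xml.etree.ElementTree; the SAMPLE data and the keep_domain helper are hypothetical names introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-user sample modelled on the question's structure.
SAMPLE = """<users type="array">
  <user>
    <id type="integer">1</id>
    <email>user1@acme.com</email>
    <job-title></job-title>
    <external-id nil="true"/>
  </user>
  <user>
    <id type="integer">2</id>
    <email>test@aoncology.com</email>
    <job-title></job-title>
    <external-id nil="true"/>
  </user>
</users>"""


def keep_domain(xml_text, domain="@acme.com"):
    """Return the XML with every <user> whose email lacks `domain` removed."""
    root = ET.fromstring(xml_text)
    # Copy the child list first so removal does not disturb the iteration.
    for user in list(root.findall("user")):
        # findtext returns None when <email> is absent; treat that as empty.
        email = user.findtext("email") or ""
        if domain not in email:
            root.remove(user)
    return ET.tostring(root, encoding="unicode")


filtered = keep_domain(SAMPLE)
print(filtered)
```

Because the surviving elements are never recreated, their attributes and empty children are carried over as they were parsed.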
If I understand you correctly, you are looking for something like this.
Assuming a simplified XML:
users = """<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
<user>
<id type="integer">14000760626</id>
<name> Credentialing Department</name>
<email>user1@acme.com</email>
</user>
<user>
<id>14000760626</id>
<name> Credentialing Department</name>
<email>test@aoncology.com</email>
</user>
</users>"""
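from lxml import etree

# Parse from bytes: lxml rejects str input that carries an encoding declaration
doc = etree.XML(users.encode())
for user in doc.xpath('//users/user'):
    # findtext returns None if <email> is missing; treat that as an empty string
    email = user.findtext('email') or ''
    if '@acme.com' not in email:
        user.getparent().remove(user)
print(etree.tostring(doc).decode())

Output:
<users type="array">
<user>
<id type="integer">14000760626</id>
<name> Credentialing Department</name>
<email>user1@acme.com</email>
</user>
</users>

Hey Jack, just checking, since I'm not a developer: so in the terminal, 1) run users = """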
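The answer works on a string held in memory, while the question is about a file with 10,000 users. The same filtering step can be applied to a file on disk, roughly as sketched below. This version swaps in the standard library's xml.etree.ElementTree rather than lxml, and the function name filter_users_file and its parameters are hypothetical, not part of the original answer.

```python
import xml.etree.ElementTree as ET


def filter_users_file(src_path, dst_path, domain="@acme.com"):
    """Copy src_path to dst_path, dropping users whose email lacks `domain`.

    Returns the number of <user> elements removed.
    """
    tree = ET.parse(src_path)
    root = tree.getroot()
    removed = 0
    # Iterate over a copy of the list because we remove children as we go.
    for user in list(root.findall("user")):
        # A missing or empty <email> counts as not matching the domain.
        email = user.findtext("email") or ""
        if domain not in email:
            root.remove(user)
            removed += 1
    # Write an XML declaration so the output starts like the input file.
    tree.write(dst_path, encoding="UTF-8", xml_declaration=True)
    return removed
```

A call such as filter_users_file("users.xml", "users_acme.xml") (both file names are placeholders) writes the filtered copy and leaves the source file untouched, which makes it easy to compare the two before replacing anything.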