Python: how to delete blocks in XML based on a condition

My XML file contains 10,000 users, and I need to remove every user whose email does not contain @acme.com.

<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
  <user>
    <id type="integer">14000760626</id>
    <name> Credentialing Department</name>
    <email>user1@acme.com</email>
    <created-at type="dateTime">2020-03-26T10:23:34-04:00</created-at>
    <updated-at type="dateTime">2020-03-26T10:23:34-04:00</updated-at>
    <active type="boolean">false</active>
    <job-title></job-title>
    <phone>1234567890</phone>
    <mobile>1234567890</mobile>
    <description></description>
    <time-zone>Eastern Time (US &amp; Canada)</time-zone>
    <deleted type="boolean">false</deleted>
    <language>en</language>
    <address></address>
    <external-id nil="true"/>
    <helpdesk-agent type="boolean">false</helpdesk-agent>
    <location-name nil="true"/>
    <time-format>12h</time-format>
    <company-names type="array"/>
    <custom_field>
    </custom_field>
  </user>
</users>
I have also tried other approaches, but some data always gets lost. Example result:

<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
<user>
<id>14000760626</id>
<name> Credentialing Department</name>
<email>test@aoncology.com</email>
<created-at>2020-03-26T10:23:34-04:00</created-at>
<updated-at>2020-03-26T10:23:34-04:00</updated-at>
<active>false</active>
<job-title>None</job-title>
<phone>1234567890</phone>
<mobile>1234567890</mobile>
<description>None</description>
<time-zone>Eastern Time (US & Canada)</time-zone>
<deleted>false</deleted>
<language>en</language>
<address>None</address>
<external-id>None</external-id>
<helpdesk-agent>false</helpdesk-agent>
<location-name>None</location-name>
<time-format>12h</time-format>
<company-names>None</company-names>
<custom_field>
    </custom_field>
</user>

</users>

If I understand you correctly, you are looking for something like the following.

Assuming a simplified XML:

users = """<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
  <user>
    <id type="integer">14000760626</id>
    <name> Credentialing Department</name>
    <email>user1@acme.com</email>      
  </user>
  <user>
    <id>14000760626</id>
    <name> Credentialing Department</name>
    <email>test@aoncology.com</email>
   </user>
</users>"""
you can then remove the non-matching users with lxml:

from lxml import etree

# Parse from bytes: lxml rejects str input that carries an encoding declaration
doc = etree.XML(users.encode())
for user in doc.xpath('//users/user'):
    # findtext() returns None when <email> is missing; treat that as no match
    if "@acme.com" not in (user.findtext('email') or ''):
        user.getparent().remove(user)
print(etree.tostring(doc).decode())

Output:

<users type="array">
  <user>
    <id type="integer">14000760626</id>
    <name> Credentialing Department</name>
    <email>user1@acme.com</email>      
  </user>
  </users>

Comment: Hey Jack, checking since I'm not a developer: in the terminal, 1) run users = """
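If lxml is not available, roughly the same filtering can be sketched with the standard library's xml.etree.ElementTree. This is an alternative to the lxml answer, not part of it. The function name keep_users_with_domain and the shortened sample document are made up for illustration, and the check is tightened to "ends with @acme.com, ignoring case", which is an assumption about what the question intends. Because elements are removed in place rather than rebuilt, attributes such as type="integer" should survive untouched.

```python
import xml.etree.ElementTree as ET


def keep_users_with_domain(xml_text: str, domain: str = "@acme.com") -> str:
    """Return xml_text without the <user> elements whose <email> does not end in domain.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text.encode("utf-8"))
    # list(...) takes a snapshot, so removing children does not disturb the loop
    for user in list(root.findall("user")):
        # A missing or empty <email> becomes "" and is therefore removed
        email = (user.findtext("email") or "").strip().lower()
        if not email.endswith(domain):
            root.remove(user)
    return ET.tostring(root, encoding="unicode")


# Shortened sample document (assumed data, modeled on the question)
sample = """<?xml version="1.0" encoding="UTF-8"?>
<users type="array">
  <user>
    <id type="integer">1</id>
    <email>user1@acme.com</email>
  </user>
  <user>
    <id type="integer">2</id>
    <email>test@aoncology.com</email>
  </user>
  <user>
    <id type="integer">3</id>
    <email></email>
  </user>
</users>"""

result = keep_users_with_domain(sample)
print(result)
```

For a file on disk, ET.parse(path) followed by tree.write(path, encoding="UTF-8", xml_declaration=True) could take the place of fromstring and tostring. Note that ElementTree writes empty elements such as <job-title></job-title> in the self-closing form, which is equivalent XML but not byte-identical to the input.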