Python: How to remove comments outside the root element of an XML document using lxml


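I have some Python code that tries to strip all comments from a variety of XML documents that I don't control. It should be able to handle any valid XML. Here is the code so far: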

import lxml.etree

tree = lxml.etree.parse(path_to_xml_file)
for c in tree.xpath('//comment()'):
    c.getparent().remove(c)

This code crashes on this particular XML file:

<!-- This comment can't be removed. -->
<foo>
  <!-- This comment can be removed. -->
</foo>


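The first comment cannot be removed because it has no parent element: c.getparent() returns None for that comment. I haven't found any other documentation on how to remove nodes from an XML tree. So how can this comment be removed?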
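The failure described in the question can be reproduced with a short sketch. The helper name remove_nested_comments and the in-memory XML constant are made up for illustration; the sketch only shows that the top-level comment has no parent, and it does not yet remove that comment.

```python
import io
from lxml import etree

# Same document as in the question, kept in memory for the demo.
XML = b"""<!-- This comment can't be removed. -->
<foo>
  <!-- This comment can be removed. -->
</foo>
"""

def remove_nested_comments(tree):
    """Remove every comment that has a parent; return the ones that do not."""
    parentless = []
    for c in tree.xpath('//comment()'):
        parent = c.getparent()
        if parent is None:
            # Top-level comment (a sibling of the root element):
            # there is no parent to call remove() on.
            parentless.append(c)
        else:
            parent.remove(c)
    return parentless

tree = etree.parse(io.BytesIO(XML))
leftover = remove_nested_comments(tree)
print(len(leftover))  # number of comments that could not be removed this way
```

Guarding on getparent() avoids the crash, but the comment outside the root element stays in the tree, which is the actual problem being asked about.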

You can remove this comment if you wrap the XML in a new tag, either before or after parsing it with lxml (this is not very pretty, but it works).

Assuming you do the wrapping outside lxml:

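from lxml import etree

comt = """
<!-- This comment can't be removed. -->
<foo>
  <!-- This comment can be removed. -->
</foo>
"""

# Wrap the document so that every comment ends up with a parent element.
new_comt = "<super_root>" + comt + "</super_root>"
tree = etree.fromstring(new_comt)

# Every comment now has a parent, so the original loop works.
for c in tree.xpath('//comment()'):
    c.getparent().remove(c)

print(etree.tostring(tree).decode())

This prints roughly the following (exact whitespace may differ):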

    <super_root><foo>
  </foo></super_root>

If necessary, you can also remove the wrapper tag afterwards.


As I said, it is not very elegant, but it gets the job done.

To remove all comments, use an XMLParser created with remove_comments=True:

from lxml import etree

parser = etree.XMLParser(remove_comments=True)
tree = etree.parse("test.xml", parser)
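To check this against the document from the question without needing a test.xml on disk, here is a minimal sketch that parses the same content from bytes. The XML declaration is added only to exercise that case, and the variable names DOC and cleaned are illustrative.

```python
from lxml import etree

# The question's document, plus an XML declaration, as bytes.
DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!-- This comment can't be removed. -->
<foo>
  <!-- This comment can be removed. -->
</foo>
"""

# The parser discards comments while reading, so none of them reach the tree.
parser = etree.XMLParser(remove_comments=True)
root = etree.fromstring(DOC, parser)

# Serialize the whole document, not just the root element.
cleaned = etree.tostring(root.getroottree(), xml_declaration=True, encoding="UTF-8")
print(cleaned.decode("utf-8"))
```

Because the comments are dropped at parse time, the missing-parent problem never comes up, and no wrapper element is needed.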

This does not work for XML files that have an XML declaration. The question has been clarified.