Python entity references and lxml
Here is my code:
from cStringIO import StringIO
from lxml import etree
xml = StringIO('''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY test "This is a test">
]>
<root>
<sub>&test;</sub>
</root>''')
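# Default parser: the entity is resolved to its replacement text.
d1 = etree.parse(xml)
print '%r' % d1.find('/sub').text  # 'This is a test'
# With resolve_entities=False the reference is kept, but .text is None.
parser = etree.XMLParser(resolve_entities=False)
d2 = etree.parse(xml, parser=parser)
print '%r' % d2.find('/sub').text  # None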
How can I get lxml to give me "&test;", i.e. the raw entity reference?

The "unresolved" entity is kept as a child node of the element node sub:
>>> print d2.find('/sub')[0]
&test;
>>> d2.find('/sub').getchildren()
[&test;]
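
The behaviour described above can be sketched in Python 3 as follows. This is a minimal illustration rather than a definitive recipe: `raw_text` is a hypothetical helper name introduced here, and it assumes that entity nodes can be recognised with `child.tag is etree.Entity` and that their `.text` holds the `&name;` form.

```python
from io import BytesIO
from lxml import etree

XML = b'''<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE root [
<!ENTITY test "This is a test">
]>
<root>
<sub>&test;</sub>
</root>'''

# Keep entity references as nodes instead of expanding them.
parser = etree.XMLParser(resolve_entities=False)
doc = etree.parse(BytesIO(XML), parser)

# Search from the root element; 'sub' is a direct child of <root>.
sub = doc.getroot().find('sub')
entity = sub[0]  # the unresolved entity is the first child of <sub>


def raw_text(element):
    """Hypothetical helper: rebuild an element's text, keeping entity
    references in their '&name;' form. Content of ordinary child
    elements is skipped; only their tails are kept."""
    parts = [element.text or '']
    for child in element:
        if child.tag is etree.Entity:
            parts.append(child.text)  # '&name;' for entity nodes
        parts.append(child.tail or '')
    return ''.join(parts)


print(entity.name)    # name of the entity, without '&' and ';'
print(entity.text)    # the raw reference
print(raw_text(sub))  # text of <sub> with the reference left intact
```

Because `sub.text` only covers the text before the first child node, it is `None` here; the reference itself has to be read from the child node, which is why the helper walks the children and their tails.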
So there are still some things Ignacio doesn't know ;)