Python lxml.etree.XMLSyntaxError: internal error: Huge input lookup
I am trying to parse a large XML file (~500 MB) with the Python lxml library, specifically iterparse, using:
from lxml import etree

context = etree.iterparse('large-file.xml')
for event, element in context:
    # do some stuff
    element.clear()  # release the element's content once it has been processed
But it returns the following error:
Traceback (most recent call last):
  File "test.py", line 176, in <module>
    test_parser()
  File "test.py", line 121, in test_parser
    for event, element in context:
  File "src/lxml/iterparse.pxi", line 208, in lxml.etree.iterparse.__next__ (src/lxml/etree.c:155963)
  File "src/lxml/iterparse.pxi", line 193, in lxml.etree.iterparse.__next__ (src/lxml/etree.c:155671)
  File "src/lxml/iterparse.pxi", line 228, in lxml.etree.iterparse._read_more_events (src/lxml/etree.c:156298)
  File "src/lxml/parser.pxi", line 1362, in lxml.etree._FeedParser.feed (src/lxml/etree.c:116552)
  File "src/lxml/parser.pxi", line 589, in lxml.etree._ParserContext._handleParseResult (src/lxml/etree.c:107619)
  File "src/lxml/parser.pxi", line 598, in lxml.etree._ParserContext._handleParseResultDoc (src/lxml/etree.c:107738)
  File "src/lxml/parser.pxi", line 709, in lxml.etree._handleParseResult (src/lxml/etree.c:109447)
  File "src/lxml/parser.pxi", line 638, in lxml.etree._raiseParseError (src/lxml/etree.c:108301)
  File "large-file.xml", line 20593
lxml.etree.XMLSyntaxError: internal error: Huge input lookup, line 20593, column 199
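This is slightly different from its possible duplicate, because I am using iterparse here. But using the solution given there, I was able to fix it by passing huge_tree=True:
context = etree.iterparse('large-file.xml', huge_tree=True)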
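For reference, here is a minimal sketch of that fix as a reusable function. It assumes the error comes from libxml2's default safety limits, which lxml's huge_tree=True option disables. The helper name count_elements and the tag filter are hypothetical, added only for illustration, and the small in-memory document stands in for the real large-file.xml.

```python
import io

from lxml import etree


def count_elements(source, tag=None):
    """Stream through an XML source and count the 'end' events seen.

    `source` may be a filename or a binary file-like object.
    `count_elements` is a hypothetical helper, not part of lxml.
    """
    count = 0
    # huge_tree=True lifts libxml2's built-in security restrictions
    # (very deep trees, very long text content).
    context = etree.iterparse(source, events=("end",), tag=tag, huge_tree=True)
    for _event, element in context:
        count += 1  # placeholder for "do some stuff"
        element.clear()  # free the element's content after processing
    return count


if __name__ == "__main__":
    # Tiny stand-in document; pass 'large-file.xml' for the real file.
    sample = io.BytesIO(b"<root><item>a</item><item>b</item></root>")
    print(count_elements(sample, tag="item"))
```

Because huge_tree=True turns off protections against oversized or maliciously crafted input, it is probably best reserved for files from a trusted source.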