Minidom Python XML parsing with exception handling

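I am stripping sensitive data out of millions of XML documents. How can I add a try/except to get past the error below? It seems to occur because a few malformed XML documents are mixed into the batch of files.

xml.parsers.expat.ExpatError: mismatched tag: line 1, column 28691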

#!/usr/bin/python
import sys
from xml.dom import minidom

def getCleanString(word):
        str = ""
        dummy = 0
        for character in word:
                try:
                        character = character.encode('utf-8')
                        str = str + character
                except:
                        dummy += 1
        return str

def parsedelete(content):

        dom = minidom.parseString(content)

        for element in dom.getElementsByTagName('RI_RI51_ChPtIncAcctNumber'):
                parentNode = element.parentNode
                parentNode.removeChild(element)

        return dom.toxml()


for line in sys.stdin:
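        # NOTE: this check never filters anything. Python 2 orders any str above any int,
        # so it is always true; Python 3 raises TypeError here instead.
        if line > 1: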
                line = line.strip()
                line = line.split(',', 2)
                if len(line) > 2:
                        partition = line[0]
                        id = line[1]
                        xml = line[2]
                        xml = getCleanString(xml)
                        xml = parsedelete(xml)
                        strng = '%s\t%s\t%s' %(partition, id, xml)
                        sys.stdout.write(strng + '\n')

Catching the exception is straightforward. Add
import xml.parsers.expat
to your import statements and wrap the problem code in a try/except handler:

def parsedelete(content):
    try:
        dom = minidom.parseString(content)
    except xml.parsers.expat.ExpatError as e:
        # Not sure how you want to handle the error, so this just passes
        # the error message back as a string.
        return str(e)

    # Remove every occurrence of the sensitive element from the tree.
    for element in dom.getElementsByTagName('RI_RI51_ChPtIncAcctNumber'):
        parentNode = element.parentNode
        parentNode.removeChild(element)

    return dom.toxml()
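
If returning the error text in place of the XML is not what you want, one alternative is to skip the malformed record entirely and report it on stderr. The following is only a sketch in Python 3 under that assumption. The generator clean_records and the tag parameter are hypothetical names introduced for illustration, and the getCleanString step from the question is left out.

```python
import sys
from xml.dom import minidom
from xml.parsers.expat import ExpatError

SENSITIVE_TAG = 'RI_RI51_ChPtIncAcctNumber'  # tag name taken from the question


def parsedelete(content, tag=SENSITIVE_TAG):
    """Return content with every <tag> element removed, or None if it does not parse."""
    try:
        dom = minidom.parseString(content)
    except ExpatError as exc:
        # Assumption: a malformed record should be reported and skipped.
        sys.stderr.write('skipping malformed XML: %s\n' % exc)
        return None
    for element in dom.getElementsByTagName(tag):
        element.parentNode.removeChild(element)
    return dom.toxml()


def clean_records(lines):
    """Hypothetical driver: each input line is 'partition,id,xml'.

    Yields one tab-separated output row per record that parsed cleanly.
    """
    for line in lines:
        fields = line.strip().split(',', 2)
        if len(fields) < 3:
            continue  # not enough columns, same rule as the question's loop
        partition, record_id, xml_text = fields
        cleaned = parsedelete(xml_text)
        if cleaned is None:
            continue  # malformed XML, already reported on stderr
        yield '%s\t%s\t%s\n' % (partition, record_id, cleaned)
```

To use it on the same input as the question, write each row from clean_records(sys.stdin) to sys.stdout. Returning None keeps error messages out of the output column, so downstream consumers never mistake an error string for cleaned XML.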
How do you want to "bypass" the error? Where in the code does it occur?
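
To see where in a document the parser gives up, the ExpatError instance carries code, lineno and offset attributes. The sketch below (Python 3) collects them into a dict. The helper describe_parse_error and its context parameter are hypothetical, and the snippet assumes the reported column can be used as an index into the text, which may not hold for non-ASCII input.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError, ErrorString


def describe_parse_error(content, context=20):
    """Return None if content parses, else a dict describing where parsing failed."""
    try:
        minidom.parseString(content)
    except ExpatError as exc:
        lines = content.splitlines() or ['']
        # exc.lineno is 1-based; guard against an error reported past the last line.
        bad_line = lines[exc.lineno - 1] if exc.lineno <= len(lines) else ''
        # Assumption: exc.offset is usable as a character index into bad_line.
        start = max(exc.offset - context, 0)
        return {
            'message': ErrorString(exc.code),
            'line': exc.lineno,
            'column': exc.offset,
            'snippet': bad_line[start:exc.offset + context],
        }
    return None
```

Calling describe_parse_error on one of the failing records and printing the 'snippet' entry shows the text around the reported column, which is usually enough to spot the unclosed or mismatched tag.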