Parsing NMAP XML output in Python: looking up "elem key=" returns a NodeList


I am trying to parse specific values out of an NMAP XML file. The relevant part of the XML file looks like this:

<nmaprun scanner="nmap" args="nmap -A -P0 -oA scanoutput 192.168.1.5" start="1445258532" startstr="Mon Oct 19 08:42:12 2015" version="6.47" xmloutputversion="1.04">
    <hostscript>
        <script id="smb-os-discovery" output="&#10;  OS: Windows Server 2008 R2 Standard 7601 Service Pack 1 (Windows Server 2008 R2 Standard 6.1)&#10;  OS CPE: cpe:/o:microsoft:windows_server_2008::sp1&#10;  Computer name: SOMEHOSTNAME&#10;  NetBIOS computer name: SOMEHOSTNAME&#10;  Domain name: domain.local&#10;  Forest name: domain.local&#10;  FQDN: SOMEHOSTNAME.domain.local&#10;  System time: 2015-10-19T08:50:07-04:00&#10;">
            <elem key="os">Windows Server 2008 R2 Standard 7601 Service Pack 1</elem>
            <elem key="lanmanager">Windows Server 2008 R2 Standard 6.1</elem>
            <elem key="server">SOMEHOSTNAME\x00</elem>
            <elem key="date">2015-10-19T08:50:07-04:00</elem>
            <elem key="fqdn">SOMEHOSTNAME.domain.local</elem>
            <elem key="domain_dns">domain.local</elem>
            <elem key="forest_dns">domain.local</elem>
            <elem key="workgroup">HOME\x00</elem>
            <elem key="cpe">cpe:/o:microsoft:windows_server_2008::sp1</elem>
        </script>
    </hostscript>
</nmaprun>
If I change my code to:

serveros = [script.getElementsByTagName('os') for script in hosttag.getElementsByTagName('script') if script.getAttribute('id') == 'smb-os-discovery']
I get this error:

TypeError: sequence item 0: expected string, NodeList found

Thanks in advance.
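A note on the error itself: getElementsByTagName returns a NodeList, so the list comprehension builds a list of NodeLists, and "sequence item 0: expected string" is the message str.join raises when handed something other than strings (presumably from a join call that is not shown). There is also no <os> tag in the file; the value sits in an <elem> whose key attribute is "os". A minimal sketch with xml.dom.minidom follows, assuming a trimmed copy of the XML above; the helper name elem_values is hypothetical.

```python
from xml.dom import minidom

# Trimmed copy of the XML from the question (assumption: only the parts
# needed for the lookup are kept, and the long output attribute is cut).
XML = """<nmaprun scanner="nmap">
    <hostscript>
        <script id="smb-os-discovery" output="...">
            <elem key="os">Windows Server 2008 R2 Standard 7601 Service Pack 1</elem>
            <elem key="fqdn">SOMEHOSTNAME.domain.local</elem>
        </script>
    </hostscript>
</nmaprun>"""


def elem_values(root, script_id, key):
    """Return the text of every <elem key=KEY> under <script id=SCRIPT_ID>.

    Hypothetical helper, not part of minidom.
    """
    values = []
    for script in root.getElementsByTagName("script"):
        if script.getAttribute("id") != script_id:
            continue
        # Select <elem> nodes by their key attribute, not by a tag named "os".
        for elem in script.getElementsByTagName("elem"):
            if elem.getAttribute("key") == key and elem.firstChild is not None:
                values.append(elem.firstChild.data)  # text node content
    return values


doc = minidom.parseString(XML)
serveros = elem_values(doc, "smb-os-discovery", "os")
print(", ".join(serveros))  # a list of strings can be joined safely
```

Because the helper returns plain strings rather than nodes, the result can be joined or printed directly.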

Perhaps something like this:

class ScriptResult:
    """Holds the attributes of one <script> element (id, output, ...)."""

    def __init__(self, script, port_number):
        self.port_number = port_number
        # Copy every XML attribute of the <script> element onto the object.
        for k, v in script.attrib.items():
            self.__dict__[k] = v

    def __str__(self):
        d = '\n'
        for k, v in self.__dict__.items():
            d += '    %-30s :   %s\n' % (k, v)
        return "ScriptResult(%s)\n" % d


class Host:
    def __init__(self):
        self.script_results = []   # list of ScriptResult objects

    def print_results(self):
        for i in self.script_results:
            print(i)


class XML_Parser:

    def get_hostscripts(self, host, xml_host_element):
        # xml_host_element is expected to be an ElementTree-style element
        # (it must provide .findall, and its children .attrib).
        for hs in xml_host_element.findall('hostscript'):
            for s in hs.findall('script'):
                host.script_results.append(ScriptResult(s, 'host'))
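
The classes above keep only the attributes of each <script> element (id and output), not its <elem> children. If the structured values are what you need, they can be read in the same ElementTree style. This is only a sketch, assuming a trimmed copy of the XML from the question; the helper name script_elems is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (assumption: the long output
# attribute is cut and only three <elem> entries are kept).
XML = """<nmaprun scanner="nmap">
    <hostscript>
        <script id="smb-os-discovery" output="...">
            <elem key="os">Windows Server 2008 R2 Standard 7601 Service Pack 1</elem>
            <elem key="fqdn">SOMEHOSTNAME.domain.local</elem>
            <elem key="domain_dns">domain.local</elem>
        </script>
    </hostscript>
</nmaprun>"""


def script_elems(root, script_id):
    """Map each <elem key="..."> under the named script to its text.

    Hypothetical helper, not part of ElementTree.
    """
    result = {}
    for script in root.iter("script"):
        if script.get("id") != script_id:
            continue
        for elem in script.findall("elem"):
            key = elem.get("key")
            if key is not None:
                result[key] = (elem.text or "").strip()
    return result


root = ET.fromstring(XML)
info = script_elems(root, "smb-os-discovery")
print(info.get("os", "unknown"))          # dict.get tolerates missing keys

# The same single value via ElementTree's limited XPath support.
node = root.find(".//script[@id='smb-os-discovery']/elem[@key='os']")
print(node.text if node is not None else "unknown")
```

Returning a dict keeps the lookup tolerant of scripts that omit a key, which Nmap scripts often do.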

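Are you using the lxml module? If so, it supports XPath; consider the expression //elem[@key='os']. xml.dom.minidom is limited when it comes to querying XML nodes and attributes.

The original poster is trying to read Nmap XML files. These files are messy: each script fills the XML with whatever information it considers important. The original poster tried to handle this at the XML level with the DOM lookup tools. The example I gave does not work at the XML level; instead it puts all of the XML attributes into a class object. Then, at a higher level, in Python, you can pull out the pieces and format them as needed, especially if you use getattr, which tolerates missing values and lets you supply a default.

self.analyze_script_smb_os_discovery(item, s, script_name, script_output)

def analyze_script_smb_os_discovery(self, item, s, script_name, script_output):
    if script_name != 'smb-os-discovery':
        return
    script_output = script_output.strip()
    # Split each "Label: value" line at the first colon.
    olist = [i.partition(':') for i in script_output.split('\n')]
    odict = dict([(i[0].strip(), i[2].strip()) for i in olist])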
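
The output-parsing idea from the last comment can be tried on its own. The following is a rough sketch, not the commenter's full code: parse_script_output is a hypothetical name, the sample text is a shortened copy of the output attribute from the question, and SimpleNamespace stands in for a ScriptResult-like object.

```python
from types import SimpleNamespace


def parse_script_output(script_output):
    """Turn the 'Label: value' lines of a script's output into a dict."""
    fields = {}
    for line in script_output.strip().split("\n"):
        # partition splits at the first colon only, so values that contain
        # colons (CPE strings, timestamps) stay intact.
        label, sep, value = line.partition(":")
        if sep:  # skip lines that have no colon at all
            fields[label.strip()] = value.strip()
    return fields


# Shortened copy of the output attribute (&#10; decoded to newlines).
OUTPUT = (
    "\n  OS: Windows Server 2008 R2 Standard 7601 Service Pack 1"
    " (Windows Server 2008 R2 Standard 6.1)"
    "\n  OS CPE: cpe:/o:microsoft:windows_server_2008::sp1"
    "\n  Computer name: SOMEHOSTNAME"
    "\n  System time: 2015-10-19T08:50:07-04:00\n"
)

# Stand-in for an object whose attributes were copied from <script>.
result = SimpleNamespace(id="smb-os-discovery", output=OUTPUT)

# getattr with a default tolerates objects that lack an output attribute.
fields = parse_script_output(getattr(result, "output", ""))
print(fields.get("Computer name", "unknown"))
print(fields.get("Forest name", "unknown"))  # absent here, falls back
```

Both getattr and dict.get take a default, so a script that reports fewer fields does not raise an exception.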