Parsing XML with Python ElementTree


I have a very long XML document with the following structure:

<carrierData>
    <inspections>
        <inspection inspection_date="2013-01-16" report_state="TX" report_number="TX130G0ELJ05" level="1" time_weight="1">
           <drivers>
              <driver driver_type="Primary Driver" first_name="JOHN" last_name="SMITH" date_of_birth="1962-11-20" license_state="TX" License_number="12345678"/>
              <driver driver_type="CoDriver"/>
           </drivers>
           <vehicles>
               <vehicle unit="1" vehicle_id_number="2HSCAAXN02C039269" unit_type="Truck Tractor" license_state="TX" license_number="1B13577"/>
               <vehicle unit="2" vehicle_id_number="1GRAA76228S702393" unit_type="Semi-Trailer" license_state="TX" license_number="X99757"/>
           </vehicles>
           <violations>
               <violation code="393.11" description="No/defective lighting devices/reflective devices/projected" oos="N" time_severity_weight="3" BASIC="Vehicle Maint."/>
               <violation code="393.53(b)" description="Automatic brake adjuster CMV manufactured on or after 10/20/1994 - air brake" oos="N" time_severity_weight="4" BASIC="Vehicle Maint."/>
               <violation code="393.47(e)" description="Clamp/Roto-Chamber type brake(s) out of adjustment" oos="N" time_severity_weight="4" BASIC="Vehicle Maint."/>
               <violation code="396.3(a)(1)" description="Inspection/repair and maintenance parts and accessories" oos="N" time_severity_weight="2" BASIC="Vehicle Maint."/>
           </violations>
    </inspection>

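I'm new to programming, so I may be missing something obvious, but without an error message to refer to I'm having a hard time tracking down the problem.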

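What did you intend with this line?

if ['report_id'] == [x]:

With this code you are testing
['report_id'] == ['TX3YZ8HQE1X1']
['report_id'] == ['TX3YAEHQE15W']
and so on, which will never be true: each side is a one-element list literal, and the string 'report_id' is never equal to a report number. That is why your code exits without printing anything or raising an error.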
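
To make the point concrete, here is a minimal sketch of that comparison. The report number comes from the sample XML above; the variable names and the plain dict standing in for node.attrib are only illustrative assumptions.

```python
# Comparing two list literals compares their contents, not any XML attribute.
x = "TX130G0ELJ05"

# Left side: a list holding the literal string 'report_id'.
# Right side: a list holding the current code.
always_false = ['report_id'] == [x]

# What was presumably intended: look the attribute up on the element.
attrib = {"report_number": "TX130G0ELJ05"}  # stand-in for node.attrib
matches = attrib["report_number"] == x

print(always_false)
print(matches)
```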

Nothing in the XML you posted is named report_id. Did you mean report_number?
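
One way to check which attribute names an element actually carries is to list them. This is only a sketch: SAMPLE is a trimmed copy of the inspection element from the question, closed so that it parses on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document (drivers, vehicles and violations omitted).
SAMPLE = """<carrierData><inspections>
  <inspection inspection_date="2013-01-16" report_state="TX" report_number="TX130G0ELJ05" level="1" time_weight="1"/>
</inspections></carrierData>"""

root = ET.fromstring(SAMPLE)

# Find the first <inspection> anywhere below the root and list its attribute names.
first = root.find(".//inspection")
attribute_names = sorted(first.attrib)
print(attribute_names)
```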

If you want the names of the primary drivers for all of the report_number values in the codes list, try the following:

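# Assumes tree is the parsed document and codes is the list of report numbers.
for x in codes:
    # Scan every <inspection> element for the current report number.
    for node in tree.iter('inspection'):
        if node.attrib['report_number'] == x:
            # Keep only the driver marked as the primary driver.
            primary_driver = [d for d in node.iter('driver') if d.attrib['driver_type'] == "Primary Driver"]
            primary_driver = primary_driver[0]
            first_name = primary_driver.attrib['first_name']
            last_name = primary_driver.attrib['last_name']
            print first_name, last_name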

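However, this code has a performance problem. You are looping over the entire XML document once for every code in codes. That costs O(number of codes * number of records), which is effectively O(N**2) when the two counts are of similar size. You can do it in O(N) steps by looping over the document once and using a set to decide whether each record should be included.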
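
A minimal sketch of that single-pass approach follows. The function name primary_drivers and the second inspection record in SAMPLE are made up for illustration; only the first record mirrors the XML in the question. It uses .get() rather than .attrib[...] so that an element missing an attribute is skipped instead of raising KeyError.

```python
import xml.etree.ElementTree as ET

# Trimmed sample: the first record mirrors the question's XML,
# the second is a hypothetical record added to show filtering.
SAMPLE = """<carrierData>
  <inspections>
    <inspection report_number="TX130G0ELJ05">
      <drivers>
        <driver driver_type="Primary Driver" first_name="JOHN" last_name="SMITH"/>
        <driver driver_type="CoDriver"/>
      </drivers>
    </inspection>
    <inspection report_number="TX000000000X">
      <drivers>
        <driver driver_type="Primary Driver" first_name="JANE" last_name="DOE"/>
      </drivers>
    </inspection>
  </inspections>
</carrierData>"""


def primary_drivers(root, codes):
    """Return (first_name, last_name) for each inspection whose report number is in codes."""
    wanted = set(codes)  # membership tests on a set take constant time on average
    found = []
    for node in root.iter('inspection'):  # one pass over the document
        if node.get('report_number') not in wanted:
            continue
        for d in node.iter('driver'):
            if d.get('driver_type') == 'Primary Driver':
                found.append((d.get('first_name'), d.get('last_name')))
                break  # stop at the first primary driver for this inspection
    return found


root = ET.fromstring(SAMPLE)
result = primary_drivers(root, ["TX130G0ELJ05"])
print(result)
```

Results come back in document order rather than in the order of codes. If the order of codes matters, a dict keyed by report number could be built in the same single pass.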

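Thank you, sir! It worked! My code did have the correct "report_number" attribute, but I think I had "report_id" in it when I first typed it up, so sorry for the confusion here. Otherwise it gave me exactly what I needed, and seeing the answer gave me a more precise understanding of what I was trying to do. I also implemented the set() suggestion above, and it works very well. Thanks again!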