How do I extract/parse dictionary elements using Python?

I want to extract the value 00s from the decade attribute, but none of my attempts has given the desired result.

Below is part of my XML file, saved as gorillaz_catalog.xml

<CATALOG>
    <CD decade="00s">
        <TITLE>Gorillaz</TITLE>
        <ARTIST>Gorillaz</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>Virgin</COMPANY>
        <PRICE>10.90</PRICE>
        <YEAR>2001</YEAR>
    </CD>
    <CD decade="00s">
        <TITLE>Demon Days</TITLE>
        <ARTIST>Gorillaz</ARTIST>
        <COUNTRY>UK</COUNTRY>
        <COMPANY>Parlaphone</COMPANY>
        <PRICE>9.90</PRICE>
        <YEAR>1988</YEAR>
    </CD>
and so on through the rest of the XML file.

I tested each part separately and ended up with the following code:

import xml.etree.ElementTree as ET

tree = ET.parse("gorillaz_catalog.xml")
root = tree.getroot()

for ARTIST in root.iter("ARTIST"):
    print("Artist:", ARTIST.text)

for TITLE in root.iter("TITLE"):
    print("Title:", TITLE.text)

for decade in root.iter("CD"):
    print("Decade:", decade.attrib)
For the decade, I get {'decade': '00s'}, but what I want is
00s
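
The difference here is between an element's whole attribute mapping and a single attribute value. A minimal sketch, assuming an inline copy of one record (the sample string is made up for illustration so the snippet runs without the file):

```python
import xml.etree.ElementTree as ET

# Inline copy of one record from the catalog above (illustrative assumption,
# so the sketch does not depend on gorillaz_catalog.xml being present).
sample = '<CATALOG><CD decade="00s"><TITLE>Gorillaz</TITLE></CD></CATALOG>'
root = ET.fromstring(sample)

cd = root.find("CD")
attrs = cd.attrib          # the full attribute mapping of the element
decade = cd.get("decade")  # one attribute value, looked up by name

print("Attributes:", attrs)
print("Decade:", decade)
```

Element.get returns None when the attribute is missing, so it is also safe for records that have no decade attribute.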

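Then I tried to loop over everything to get the result I want (after commenting out the three statements above).

The result I get is repeated 20 by 20 times: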

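Artist: Gorillaz , Album: Gorillaz , Decade: {'decade': '00s'}
twenty times (which is the number of records in the file), and then

twenty times

This gives me the lines I want, but I don't need each one 20 times.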
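
The nested code itself is not shown, so the following is only a guess at its shape: if each loop walks the whole tree, every artist is paired with every title, giving N x N lines for N records. A small sketch under that assumption, using a two-record inline sample (made up for illustration) and comparing it with a single pass over the CD elements:

```python
import xml.etree.ElementTree as ET

# Two records copied from the catalog above (inline so the sketch is self-contained).
sample = """
<CATALOG>
    <CD decade="00s"><TITLE>Gorillaz</TITLE><ARTIST>Gorillaz</ARTIST></CD>
    <CD decade="00s"><TITLE>Demon Days</TITLE><ARTIST>Gorillaz</ARTIST></CD>
</CATALOG>
"""
root = ET.fromstring(sample)

# Hypothetical reconstruction of the nested version: both loops scan the
# entire tree, so each ARTIST is combined with each TITLE.
nested_lines = []
for artist in root.iter("ARTIST"):
    for title in root.iter("TITLE"):
        nested_lines.append(f"Artist: {artist.text}, Album: {title.text}")

# Single pass: visit each CD once and read its own children.
flat_lines = []
for cd in root.iter("CD"):
    artist = cd.findtext("ARTIST")
    title = cd.findtext("TITLE")
    flat_lines.append(f"Artist: {artist}, Album: {title}")

print(len(nested_lines), "lines from the nested loops")
print(len(flat_lines), "lines from the single pass")
```

With 20 records the nested form would produce 400 lines, while the single pass produces one line per record.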

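  • Clearly my nested loops are wrong, so how do I get them to produce the lines I want? I think I may need to put the items into a list of dictionaries, but I am not very familiar with how to do that.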

  • I think you are overcomplicating this a bit; try it with another library and xpath:

    import lxml.html as lh
    
    # Paste the markup from the question in place of the placeholder
    cds = """[your html above]"""
    
    # lxml's HTML parser lowercases tag names, hence 'cd', 'title', 'artist'
    doc = lh.fromstring(cds)
    for cd in doc.xpath('//cd'):
        decade = cd.xpath('./@decade')[0]
        title = cd.xpath('./title/text()')[0]
        artist = cd.xpath('./artist/text()')[0]
        print("Title: "+title+", Artist: "+artist+", Decade: "+decade)
    
    Output:

    Title: Gorillaz, Artist: Gorillaz, Decade: 00s
    Title: Demon Days, Artist: Gorillaz, Decade: 00s
    
    > Title: Gorillaz, Album: Gorillaz, Decade: 00s
    > Title: Gorillaz, Album: Demon Days, Decade: 00s
    

    This is my final code, after reading more of the documentation once I had posted. Thank you all for the suggestions.

    import xml.etree.ElementTree as ET
    
    tree = ET.parse("gorillaz_catalog.xml")
    root = tree.getroot()
    
    for item in tree.iterfind("CD"):
        artist = item.findtext("ARTIST")
        title = item.findtext("TITLE")
        decade = item.get("decade")
        print(f"Artist: {artist}, Album: {title}, Decade: {decade}")
    
    Output:

    Artist: Gorillaz, Album: Gorillaz, Decade: 00s
    Artist: Gorillaz, Album: Demon Days, Decade: 00s
    
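
The question also mentions collecting the items in a list of dictionaries. One possible way to do that with the standard library is sketched below; the inline sample and the dictionary keys (artist, title, decade) are illustrative choices, not something the original posts specify.

```python
import xml.etree.ElementTree as ET

# Two records copied from the catalog in the question (inline for a runnable sketch).
sample = """
<CATALOG>
    <CD decade="00s">
        <TITLE>Gorillaz</TITLE>
        <ARTIST>Gorillaz</ARTIST>
    </CD>
    <CD decade="00s">
        <TITLE>Demon Days</TITLE>
        <ARTIST>Gorillaz</ARTIST>
    </CD>
</CATALOG>
"""
root = ET.fromstring(sample)

# One dictionary per CD element; the keys are hypothetical names chosen here.
records = [
    {
        "artist": cd.findtext("ARTIST"),
        "title": cd.findtext("TITLE"),
        "decade": cd.get("decade"),
    }
    for cd in root.iterfind("CD")
]

for rec in records:
    print(f"Artist: {rec['artist']}, Album: {rec['title']}, Decade: {rec['decade']}")
```

Keeping the records in a list makes later steps, such as filtering by decade or sorting by title, independent of the XML tree.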