How do I convert an XML file to a DataFrame in Python?


I am trying to convert an XML file in the following format into a DataFrame:

<ann>
  <anime id="24235" gid="2583955622" type="TV" name="Love After World Domination" precision="TV" generated-on="2021-04-06T00:15:25Z">
    <related-prev rel="adapted from" id="24234"/>
    <info gid="1661578035" type="Main title" lang="EN">Love After World Domination</info>
    <info gid="2103040388" type="Alternative title" lang="JA">Sekai Seifuku no Ato de</info>
    <info gid="2069464047" type="Alternative title" lang="JA">恋は世界征服のあとで</info>
    <staff gid="1364018953">
    ...
    </staff>
    <staff gid="2582001321">
    ...
    </staff>
  </anime>
  <manga id="24225" gid="1003998999" type="manga" name="She's My Knight" precision="manga" generated-on="2021-04-06T00:21:21Z">
    <info gid="2757138724" type="Picture" src="https://cdn.animenewsnetwork.com/thumbnails/fit200x200/encyc/A24225-2757138724.1617642733.jpg" width="140" height="200">
    ...
    </info>
    <info gid="1643119455" type="Main title" lang="EN">She's My Knight</info>
    <info gid="2475002983" type="Alternative title" lang="JA">Ikemen Kanojo to Heroine na Ore!?</info>
    <info gid="2034824415" type="Alternative title" lang="JA">イケメン彼女とヒロインな俺!?</info>
    <info gid="1694554971" type="Plot Summary">Haruma Ichinose, 17, has been popular since he was born. So popular, in fact, that he figured no one could even come close until he met Yuki Mogami. She's tall, cool, collected, and totally makes him crazy. He may just be in love but falling for someone even more dashing than himself is hard to swallow.</info>
    <info gid="2542157561" type="Vintage">2019 (serialized on Palcy)</info>
    <info gid="851836011" type="Vintage">2019-10-22 (serialized on Palcy)</info>
    <staff gid="307631293">
      <task>Story &amp; Art</task>
      <person id="206223">Saisou</person>
    </staff>
  </manga>
  <anime id="24224" gid="885535394" type="TV" name="Watanuki-san Chi to" precision="TV" generated-on="2021-04-06T00:21:21Z">
  ...
  </anime>
  ...
I can get the IDs and names into a DataFrame with the following code:

import requests
import pandas as pd
import xml.etree.ElementTree as ET

response = requests.get('https://cdn.animenewsnetwork.com/encyclopedia/api.xml?title=24235/24233/24232/24231/24230/24229/24227/24225/24224/24223/24222/24220/24218/24217/24216/24215/24214/24213/24212/24211/24210/24209/24208/24207/24206/24205/24204/24203/24202/24201/24200/24199/24198/24196/24195/24194/24193/24192/24191/24189/24187/24186/24185/24183/24182/24180/24179/24178/24177/24176/')
root = ET.fromstring(response.text)

dfcols = ['id', 'name']
anime_df = pd.DataFrame(columns=dfcols)
# DataFrame.append was removed in pandas 2.0, so this loop needs an older pandas
for i in root.iter(tag='anime'):
    anime_df = anime_df.append(
        pd.Series([i.get('id'), i.get('name')], index=dfcols),
        ignore_index=True)
anime_df.head()

I can also get the existing plot summaries with the following code:

plot_list = root.findall('.//info[@type="Plot Summary"]')

for plot in plot_list:
    print(plot.text)

However, because I am using findall on the whole tree, I cannot link each plot summary to its corresponding ID/name. Any ideas?

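I suggest pulling all the data into dictionaries and doing the final work in a DataFrame. That is more efficient than creating a Series for each entry and appending it.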

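The solution I propose below puts id and name into one dictionary (a defaultdict), and pulls the plot summary into another dictionary (mapping) keyed by id.

After that, both can be converted to pandas data structures and merged.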

from collections import defaultdict

data = defaultdict(list)  # column name -> list of values, one per entry
mapping = {}              # entry id -> plot summary text

for entry in root:
    data['id'].append(entry.attrib['id'])
    data['name'].append(entry.attrib['name'])
    # Search only this entry's own <info> children so each summary stays tied to its id
    for ent in entry.findall("./info"):
        if ent.attrib['type'] == "Plot Summary":
            mapping[entry.attrib['id']] = ent.text

# A left merge keeps every entry; entries without a plot summary get NaN
pd.DataFrame(data).merge(pd.Series(mapping, name='plot_summary'),
                         left_on='id',
                         right_index=True,
                         how='left')

Output:
       id                                               name                                       plot_summary
0   24235                        Love After World Domination                                                NaN
1   24233                          Himitsu Kessha Yaruminati                                                NaN
2   24232                          Enman Kaiketsu! Enma-chan                                                NaN
3   24231                          Zenryoku Kaihi Flag-chan!                                                NaN
4   24230                               Konketsu no Karekore                                                NaN
5   24229                                      Teikō Penguin                                                NaN
6   24227                                      Black Channel                                                NaN
7   24225                                    She's My Knight  Haruma Ichinose, 17, has been popular since he...
8   24224                                Watanuki-san Chi to                                                NaN
9   24223                                Watanuki-san Chi no                                                NaN
10  24222                                    Tiger & Bunny 2                                                NaN
11  24220                                          Super Cub                                                NaN
12  24218                                           FUUTO PI                                                NaN
13  24217                                        Fūto Tantei                                                NaN
14  24216                                       Inō no Aicis                                                NaN
15  24215                     Gyakuten Sekai no Denchi Shōjo                                                NaN
16  24214                                     Eiga Yurukyan△                                                NaN
17  24213                            Re:cycle of Penguindrum                                                NaN
18  24212            That Time I Got Reincarnated as a Slime                                                NaN
19  24211                                Wonder Egg Priority                                                NaN
20  24210                                 Dosukoi Sushi-Zumō                                                NaN
21  24209          Motto! Majime ni Fumajime Kaiketsu Zorori                                                NaN
22  24208                                     Pui Pui Molcar                                                NaN
23  24207                              Case Study of Vanitas                                                NaN
24  24206                                              HOME!                                                NaN
25  24205                         Hachimitsu Suicide Machine                                                NaN
26  24204  Deliver Police: Nishitokyo-shi Deliver Keisats...                                                NaN
27  24203                               Ryūsatsu no Kyōkotsu                                                NaN
28  24202                          Muteking the Dancing Hero                                                NaN
29  24201                                      World Trigger                                                NaN
30  24200  Gekijō-ban Utano☆Princesama♪ Maji Love ST☆RISH...                                                NaN
31  24199  My Hero Academia THE MOVIE: World Heroes' Mission                                                NaN
32  24198                            Vampire Dies in No Time                                                NaN
33  24196                                      Visual Prison                                                NaN
34  24195                               IDOLiSH7 Third Beat!  Kujo starts carrying out his plans to defame G...
35  24194                                   Jujutsu Kaisen 0                                                NaN
36  24193                        Gekijō-ban Jujutsu Kaisen 0                                                NaN
37  24192                                           takt op.                                                NaN
38  24191        She Professed Herself Pupil of the Wise Man                                                NaN
39  24189                             Akebi's Sailor Uniform                                                NaN
40  24187                                     Love and Heart  Sure, university freshman Yagisawa has a lot o...
41  24186                                   Do It Yourself!!                                                NaN
42  24185                                   Ningen Kaishūsha                                                NaN
43  24183                           Kanashiki Debu Neko-chan                                                NaN
44  24182                    Ikinuke! Bakusō! Kusohamu-chan!                                                NaN
45  24180                                        Kaiju No. 8                                                NaN
46  24179                                       Phantom Seer                                                NaN
47  24178                      Magu-chan: God of Destruction  The God of Destruction Magu Menueku has been s...
48  24177                                           i tell c                                                NaN
49  24176                 High School Family: Kokosei Kazoku                                                NaN
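
The same idea can also be sketched in a single pass, without a separate merge step: search for the plot summary inside each entry and collect one dict per row. This is only a minimal sketch, not the method from the answer above. SAMPLE_XML is a hypothetical, trimmed stand-in for the API response, and the function name entries_to_frame and the extra kind column are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical sample shaped like the XML in the question (trimmed for brevity).
SAMPLE_XML = """\
<ann>
  <anime id="24235" name="Love After World Domination">
    <info type="Main title" lang="EN">Love After World Domination</info>
  </anime>
  <manga id="24225" name="She's My Knight">
    <info type="Main title" lang="EN">She's My Knight</info>
    <info type="Plot Summary">Haruma Ichinose, 17, has been popular since he was born.</info>
  </manga>
</ann>
"""


def entries_to_frame(root):
    """Build one row per child of <ann>, with its plot summary if present."""
    rows = []
    for entry in root:
        # Search only inside this entry, so the summary stays tied to its id.
        summary = entry.find("./info[@type='Plot Summary']")
        rows.append({
            "id": entry.get("id"),
            "name": entry.get("name"),
            "kind": entry.tag,  # element tag, e.g. "anime" or "manga"
            "plot_summary": summary.text if summary is not None else None,
        })
    # Construct the DataFrame once, after all rows have been collected.
    return pd.DataFrame(rows, columns=["id", "name", "kind", "plot_summary"])


df = entries_to_frame(ET.fromstring(SAMPLE_XML))
print(df)
```

With the real response, root = ET.fromstring(response.text) from the question could be passed in place of the sample. Using entry.get rather than entry.attrib[...] means a child element without an id or name yields a missing value instead of raising KeyError.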