Python: convert a list of single-key dictionaries with repeating keys into a DataFrame
I have a list like this:
[{'FirstOfficer': '1'}, {'SecondOfficer': '2'}, {'ThirdOfficer': '3'},{'FirstOfficer': '4'}, {'SecondOfficer': '5'}, {'ThirdOfficer': '6'},{'FirstOfficer': '7'}, {'SecondOfficer': '8'}, {'ThirdOfficer': '9'},{'FirstOfficer': '10'}, {'SecondOfficer': '11'}, {'ThirdOfficer': '12'}]
I want to convert it to a DataFrame, but the DataFrame I get looks like this:
   FirstOfficer SecondOfficer ThirdOfficer
0             1           NaN          NaN
1           NaN             2          NaN
2           NaN           NaN            3
3             4           NaN          NaN
4           NaN             5          NaN
5           NaN           NaN            6
6             7           NaN          NaN
7           NaN             8          NaN
8           NaN           NaN            9
9            10           NaN          NaN
10          NaN            11          NaN
11          NaN           NaN           12
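For reference, this staircase of NaN values is what you would expect if the list is passed straight to pd.DataFrame, because each dictionary is then treated as a separate row. A minimal sketch, assuming that is the call being made (the name df_rows is only for illustration):

```python
import pandas as pd

lst = [{'FirstOfficer': '1'}, {'SecondOfficer': '2'}, {'ThirdOfficer': '3'},
       {'FirstOfficer': '4'}, {'SecondOfficer': '5'}, {'ThirdOfficer': '6'},
       {'FirstOfficer': '7'}, {'SecondOfficer': '8'}, {'ThirdOfficer': '9'},
       {'FirstOfficer': '10'}, {'SecondOfficer': '11'}, {'ThirdOfficer': '12'}]

# Each dictionary becomes one row; keys missing from that row are filled with NaN
df_rows = pd.DataFrame(lst)
print(df_rows.shape)  # (12, 3)
```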
The column names can be anything, so I cannot hardcode them.
The expected DataFrame is:
  FirstOfficer SecondOfficer ThirdOfficer
0            1             2            3
1            4             5            6
2            7             8            9
3           10            11           12
Can anyone suggest a solution?
Any help would be greatly appreciated.

Use defaultdict to collect the values into lists keyed by dictionary key:
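from collections import defaultdict
import pandas as pd

# L is the list of single-key dictionaries from the question
d = defaultdict(list)
for x in L:
    # each dictionary holds exactly one key/value pair
    a, b = tuple(x.items())[0]
    d[a].append(b)
print(d)

df = pd.DataFrame(d)
print(df)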
Output:
  FirstOfficer SecondOfficer ThirdOfficer
0            1             2            3
1            4             5            6
2            7             8            9
3           10            11           12
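One caveat with the defaultdict approach: pd.DataFrame raises a ValueError when the collected lists have different lengths, for example if the last group of dictionaries is incomplete. A possible workaround, sketched here with a hypothetical helper name dicts_to_frame and a made-up truncated input, is to wrap each list in a pd.Series so that pandas pads the shorter columns with NaN:

```python
from collections import defaultdict

import pandas as pd


def dicts_to_frame(records):
    """Group values by key, then build one column per key (hypothetical helper)."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in record.items():
            grouped[key].append(value)
    # Wrapping each list in a Series lets pandas align on the index
    # and fill the gaps in shorter columns with NaN
    return pd.DataFrame({key: pd.Series(values) for key, values in grouped.items()})


# Illustrative input only: the second group is missing its ThirdOfficer entry
records = [{'FirstOfficer': '1'}, {'SecondOfficer': '2'}, {'ThirdOfficer': '3'},
           {'FirstOfficer': '4'}, {'SecondOfficer': '5'}]
df_uneven = dicts_to_frame(records)
print(df_uneven)
```

With complete groups this should give the same result as the plain defaultdict version; the Series wrapping only matters when the column lengths differ.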
One approach is to preprocess your list. For example:
import pandas as pd

lst = [{'FirstOfficer': '1'}, {'SecondOfficer': '2'}, {'ThirdOfficer': '3'},{'FirstOfficer': '4'}, {'SecondOfficer': '5'}, {'ThirdOfficer': '6'},{'FirstOfficer': '7'}, {'SecondOfficer': '8'}, {'ThirdOfficer': '9'},{'FirstOfficer': '10'}, {'SecondOfficer': '11'}, {'ThirdOfficer': '12'}]
data = []
# walk the list in groups of three dictionaries, one group per output row
for i in range(0, len(lst), 3):
    temp = []
    for d in lst[i:i+3]:
        for _, v in d.items():
            temp.append(v)
    data.append(temp)

df = pd.DataFrame(data, columns=["FirstOfficer", "SecondOfficer", "ThirdOfficer"])
print(df)
Output:
  FirstOfficer SecondOfficer ThirdOfficer
0            1             2            3
1            4             5            6
2            7             8            9
3           10            11           12
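Since the question says the column names cannot be hardcoded, the chunking idea above could be adapted to read the names from the data. The following is only a sketch: it assumes the keys repeat in the same fixed cycle throughout the list, and the names columns, width, rows and df_chunks are illustrative.

```python
import pandas as pd

lst = [{'FirstOfficer': '1'}, {'SecondOfficer': '2'}, {'ThirdOfficer': '3'},
       {'FirstOfficer': '4'}, {'SecondOfficer': '5'}, {'ThirdOfficer': '6'},
       {'FirstOfficer': '7'}, {'SecondOfficer': '8'}, {'ThirdOfficer': '9'},
       {'FirstOfficer': '10'}, {'SecondOfficer': '11'}, {'ThirdOfficer': '12'}]

# Distinct keys in order of first appearance (dict preserves insertion order)
columns = list(dict.fromkeys(key for d in lst for key in d))
width = len(columns)  # assumed group size: one dictionary per column

rows = []
for i in range(0, len(lst), width):
    merged = {}
    # Merge one group of single-key dictionaries into one row dictionary
    for d in lst[i:i + width]:
        merged.update(d)
    rows.append(merged)

df_chunks = pd.DataFrame(rows, columns=columns)
print(df_chunks)
```

If the keys do not follow a fixed cycle, this grouping would mix values from different rows, so grouping by key is the safer choice in that case.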
If performance is not a concern, you can use:
# l is the list of dictionaries from the question
df = pd.DataFrame(l).apply(lambda x: pd.Series(x.dropna().values))
print(df)
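To make the one-liner easier to follow, here is a self-contained sketch that splits it into two steps. The intermediate names sparse and compact are only for illustration. The first step builds the 12-row frame full of NaN; the second drops the NaN from each column and re-indexes the remaining values from 0, so the columns line up again.

```python
import pandas as pd

l = [{'FirstOfficer': '1'}, {'SecondOfficer': '2'}, {'ThirdOfficer': '3'},
     {'FirstOfficer': '4'}, {'SecondOfficer': '5'}, {'ThirdOfficer': '6'},
     {'FirstOfficer': '7'}, {'SecondOfficer': '8'}, {'ThirdOfficer': '9'},
     {'FirstOfficer': '10'}, {'SecondOfficer': '11'}, {'ThirdOfficer': '12'}]

# Step 1: one row per dictionary, NaN wherever a key is absent
sparse = pd.DataFrame(l)

# Step 2: per column, keep only the non-NaN values and give them a fresh 0..n-1 index
compact = sparse.apply(lambda col: pd.Series(col.dropna().values))
print(compact)
```

This builds the full sparse frame first and calls a Python function once per column, which is presumably why the answer mentions performance.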
Hi, I have edited the question. Please take a look. @jezrael I couldn't figure out a way to use this :(