Python Pandas and Dictionary: Convert a dict to a DataFrame, using the inner keys in the values as the DataFrame column headers
I have the following dictionary:
{
0: [{1: 0.0}, {2: 0.0}, {3: 0.0}, {4: 0.0}, {5: 0.0}, {6: 0.0}, {7: 0.0}, {8: 0.0}],
1: [{1: 0.0}, {2: 0.0}, {3: 0.0}, {4: 0.0}, {5: 0.0}, {6: 0.0}, {7: 0.0}, {8: 0.0}],
2: [{1: 0.21150571615476177}, {2: 0.20021993193784904}, {3: 0.24673408701244148}, {4: 0.26073319330403394}, {5: 0.0}, {6: 0.27012912297379343}, {7: 0.0}, {8: 0.0}],
3: [{1: 0.2786416467397351}, {2: 0.2006495239101905}, {3: 0.21600480247194567}, {4: 0.25724906204967557}, {5: 0.0}, {6: 0.26817162148227375}, {7: 0.0}, {8: 0.0}],
4: [{1: 0.2755030949011681}, {2: 0.20315735111595443}, {3: 0.21705903867972787}, {4: 0.2564000954604151}, {5: 0.0}, {6: 0.26903863724054405}, {7: 0.0}, {8: 0.0}],
5: [{1: 0.27334751895045045}, {2: 0.2012256178641117}, {3: 0.22266330432504813}, {4: 0.25925509529304697}, {6: 0.27562843736621906}],
6: [{1: 0.27739942084587565}, {2: 0.198682325880847}, {3: 0.2169017627591854}, {4: 0.25843774856843105}, {6: 0.26996683786070946}],
7: [{1: 0.2726461255684456}, {2: 0.19778567408338052}, {3: 0.2197858176643358}, {4: 0.26053721842016453}, {6: 0.26812789513005875}]
}
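How can I convert this dictionary to a DataFrame so that the inner keys in each value become the column headers for the corresponding row values? Note that rows 5, 6 and 7 have no values for inner keys 5, 7 and 8, which means I need a DataFrame like this: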
1 2 3 4 5 6 7 8
0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
2 0.211651 0.202256 0.244509 0.256969 0.000000 0.275521 0.0 0.0
3 0.273670 0.199995 0.222494 0.256303 0.000000 0.275037 0.0 0.0
4 0.280948 0.200235 0.218654 0.256737 0.000000 0.276424 0.0 0.0
5 0.281718 0.197531 0.217461 0.256043 NaN 0.271181 NaN NaN
6 0.279024 0.200089 0.218020 0.261419 NaN 0.272113 NaN NaN
7 0.278222 0.203448 0.219254 0.261846 NaN 0.269600 NaN NaN
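(The values are arbitrary; it does not matter what they are.) Other than that, I know to use DataFrame.to_csv().
Any help is appreciated. Thanks in advance.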
(Using an Ubuntu 14.04 32-bit VM and Python 2.7.) P.S. A similar question went unanswered because its wording confused other users. It has been deleted.
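I hope this question is clear and precise.
Use a dict comprehension with concat, then a small hack: sum all columns by the second level to join all the non-NaN columns: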
df = pd.concat({k: pd.DataFrame(v) for k,v in d.items()}, 1).stack().T.sum(level=1, axis=1)
print (df)
1 2 3 4 5 6 7 8
0 0.000000 0.000000 0.000000 0.000000 0.0 0.000000 0.0 0.0
1 0.000000 0.000000 0.000000 0.000000 0.0 0.000000 0.0 0.0
2 0.211506 0.200220 0.246734 0.260733 0.0 0.270129 0.0 0.0
3 0.278642 0.200650 0.216005 0.257249 0.0 0.268172 0.0 0.0
4 0.275503 0.203157 0.217059 0.256400 0.0 0.269039 0.0 0.0
5 0.273348 0.201226 0.222663 0.259255 NaN 0.275628 NaN NaN
6 0.277399 0.198682 0.216902 0.258438 NaN 0.269967 NaN NaN
7 0.272646 0.197786 0.219786 0.260537 NaN 0.268128 NaN NaN
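A note on versions, offered as a sketch and not as part of the original answer: the one-liner above relies on sum(level=1, axis=1), which newer pandas releases no longer accept, and on an all-NaN sum returning NaN, which newer releases turn into 0 unless min_count is given. Assuming a recent pandas, the same idea can be written with groupby. The dictionary below is a shortened, made-up sample in the same shape as the one in the question.

```python
import pandas as pd

# Shortened, made-up sample shaped like the question's dictionary.
d = {
    0: [{1: 0.0}, {2: 0.0}, {5: 0.0}],
    1: [{1: 0.21}, {2: 0.20}, {5: 0.0}],
    2: [{1: 0.27}, {2: 0.20}],  # inner key 5 is missing here
}

# One small frame per outer key, placed side by side under that key.
wide = pd.concat({k: pd.DataFrame(v) for k, v in d.items()}, axis=1)

# Move the inner keys into the row index and group on them.
# min_count=1 keeps NaN where an outer key has no value for an inner key.
df = wide.stack().groupby(level=1).sum(min_count=1).T
print(df)
```

The groupby step plays the role of the level-wise sum in the original line: for each inner key it collapses the scattered non-NaN entries into a single column.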
Details:
print (pd.concat({k: pd.DataFrame(v) for k,v in d.items()}, 1).stack().T)
0 1 2 3 4 5 6 7
1 2 3 4 5 6 6 7 8
0 0.000000 0.000000 0.000000 0.000000 0.0 NaN 0.000000 0.0 0.0
1 0.000000 0.000000 0.000000 0.000000 0.0 NaN 0.000000 0.0 0.0
2 0.211506 0.200220 0.246734 0.260733 0.0 NaN 0.270129 0.0 0.0
3 0.278642 0.200650 0.216005 0.257249 0.0 NaN 0.268172 0.0 0.0
4 0.275503 0.203157 0.217059 0.256400 0.0 NaN 0.269039 0.0 0.0
5 0.273348 0.201226 0.222663 0.259255 NaN 0.275628 NaN NaN NaN
6 0.277399 0.198682 0.216902 0.258438 NaN 0.269967 NaN NaN NaN
7 0.272646 0.197786 0.219786 0.260537 NaN 0.268128 NaN NaN NaN
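As the intermediate frame shows, the summing step is only needed because every inner key ends up in its own row position. A possible alternative, given here as a sketch and not as the answerer's method, is to merge each list of single-key dicts into one dict per row first and let pandas align the columns. The sample dictionary is again a shortened, made-up stand-in for the real data, and the names rows, outer, inner and items are chosen only for illustration.

```python
import pandas as pd

# Shortened, made-up sample shaped like the question's dictionary.
d = {
    0: [{1: 0.0}, {2: 0.0}, {5: 0.0}],
    1: [{1: 0.21}, {2: 0.20}, {5: 0.0}],
    2: [{1: 0.27}, {2: 0.20}],  # inner key 5 is missing here
}

# Merge each row's single-key dicts into one dict: {outer: {inner: value}}.
rows = {
    outer: {inner: value for item in items for inner, value in item.items()}
    for outer, items in d.items()
}

# Outer keys become the index, inner keys the columns; gaps become NaN.
df = pd.DataFrame.from_dict(rows, orient="index").sort_index().sort_index(axis=1)
print(df)
```

This avoids the concat/stack round trip entirely, at the cost of assuming that no inner key appears twice within one row (a later duplicate would silently overwrite the earlier value).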
Finally, I got what I wanted. Many thanks, @jezrael.