Python: building a DataFrame from an array of nested dictionaries

python pandas dictionary

How can I get a nice DataFrame from the data shown below? That is, one where foo and bar appear as proper top-level columns.

import pandas as pd
input_thing = [
    [{'foo': 1, 'bar': 'a', }], 
    [{'foo': 2, 'bar': 'b', }],
]
print(input_thing)
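
A small sketch for inspecting what pandas builds when this structure is passed in unchanged. The name `raw` is only for illustration. Each inner list is treated as one row with a single cell, so the dict stays packed inside that cell rather than being spread across columns.

```python
import pandas as pd

input_thing = [
    [{'foo': 1, 'bar': 'a'}],
    [{'foo': 2, 'bar': 'b'}],
]

# Each inner list becomes one row holding a single cell
raw = pd.DataFrame(input_thing)

print(raw.shape)             # (2, 1)
print(list(raw.columns))     # [0]
print(type(raw.iloc[0, 0]))  # <class 'dict'>
```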

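pd.DataFrame(input_thing) only produces a DataFrame with a single column 0.

The desired output would be:

pd.concat([
    pd.DataFrame(input_thing[0]),
    pd.DataFrame(input_thing[1])
])

If each sublist in input_thing has one element:

df = pd.DataFrame([i[0] for i in input_thing])
print(df)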

Prints:

   foo bar
0    1   a
1    2   b

EDIT:

df = pd.DataFrame(input_thing)
print(df[0].apply(pd.Series))

Prints:

   foo bar
0    1   a
1    2   b

Simply replace pd.concat with a for loop, and it gets you what you want:

import pandas as pd
input_thing = [
    [{'foo': 1, 'bar': 'a', }],
    [{'foo': 2, 'bar': 'b', }],
]
df = pd.DataFrame([i[0] for i in input_thing])
print(df)

That is what you are after.
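
The pd.concat and list comprehension approaches can be checked against each other. The following is a sketch rather than part of either answer: the names `via_concat`, `via_listcomp` and `renumbered` are illustrative, and `ignore_index=True` is an addition that the thread itself does not use. It shows one difference worth knowing about: a plain pd.concat keeps the index label 0 from every one-row frame.

```python
import pandas as pd

input_thing = [
    [{'foo': 1, 'bar': 'a'}],
    [{'foo': 2, 'bar': 'b'}],
]

# The question's pd.concat, written for any number of sublists.
# Each one-row frame keeps its own index label 0.
via_concat = pd.concat([pd.DataFrame(sub) for sub in input_thing])
print(list(via_concat.index))    # [0, 0]

# List comprehension: pull the single dict out of each sublist.
via_listcomp = pd.DataFrame([sub[0] for sub in input_thing])
print(list(via_listcomp.index))  # [0, 1]

# ignore_index=True (an assumption, not from the thread) renumbers the rows
renumbered = pd.concat(
    [pd.DataFrame(sub) for sub in input_thing],
    ignore_index=True,
)
print(renumbered.equals(via_listcomp))
```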

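Sure, but is there also a vectorized way?

@GeorgHeiler See my edit. I added a version with pd.Series.

I'm fairly sure df.apply(pd.Series) is very slow compared to the list comprehension, because it builds a Series object for every item in the list, which adds a lot of overhead. You're better off with the list comprehension.

@Ch3steR You beat me to it :) ... I was going to run some timeit tests.

@AndrejKesely Haha :P You could also do pd.DataFrame(chain.from_iterable(input_thing)).

Anyway, this answer is fine, but it is the same as Andrej's answer. Duplicate answers don't help the community. If you have a different answer, you can add that.

I just wanted to highlight using a loop instead of pd.concat.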
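
The chain.from_iterable suggestion from the comments also covers the case where a sublist holds more than one dict, which the `i[0]` list comprehension would silently truncate. A minimal sketch, assuming a made-up input called `nested`. The comment passes the chain object straight to pd.DataFrame; here it is wrapped in list() first to keep the step explicit.

```python
from itertools import chain

import pandas as pd

# Hypothetical input where the sublists have different lengths
nested = [
    [{'foo': 1, 'bar': 'a'}, {'foo': 2, 'bar': 'b'}],
    [{'foo': 3, 'bar': 'c'}],
]

# chain.from_iterable lazily yields every dict from every sublist;
# list() materialises them before they are handed to pandas.
flat = list(chain.from_iterable(nested))
df = pd.DataFrame(flat)
print(df)
```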

The list comprehension version above prints:

   foo bar
0    1   a
1    2   b
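
The comments claim that apply(pd.Series) is much slower than the list comprehension, and mention timeit without showing any numbers. One way to check this on your own machine is sketched below. The helper names (`make_input`, `with_list_comprehension`, `with_apply_series`, `time_approaches`) and the row counts are assumptions made for illustration, and no timings are quoted here because they depend on hardware and pandas version.

```python
import timeit

import pandas as pd


def make_input(n_rows):
    # Hypothetical helper: same shape as input_thing, just longer
    return [[{'foo': i, 'bar': str(i)}] for i in range(n_rows)]


def with_list_comprehension(data):
    # Unwrap each one-element sublist, then build the frame once
    return pd.DataFrame([sub[0] for sub in data])


def with_apply_series(data):
    # Build the single-column frame, then expand each dict into a Series
    return pd.DataFrame(data)[0].apply(pd.Series)


def time_approaches(n_rows=1000, number=5):
    # Total seconds for `number` runs of each approach
    data = make_input(n_rows)
    return {
        'list_comprehension': timeit.timeit(
            lambda: with_list_comprehension(data), number=number),
        'apply_series': timeit.timeit(
            lambda: with_apply_series(data), number=number),
    }


if __name__ == '__main__':
    for name, seconds in time_approaches().items():
        print(f'{name}: {seconds:.4f} s')
```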