Python: How do I unfold a dictionary of dictionaries into a DataFrame for large dictionaries?

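Consider the following dictionary in Python 3.x:

import pandas as pd

dict1 = {4: {4:25, 5:39, 3:42}, 5:{24:94, 252:49, 25:4, 55:923}}

I would like to unfold it into a DataFrame. There appear to be two options: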

df1 = pd.DataFrame.from_dict(dict1, orient='columns')

print(df1)
        4      5
3    42.0    NaN
4    25.0    NaN
5    39.0    NaN
24    NaN   94.0
25    NaN    4.0
55    NaN  923.0
252   NaN   49.0
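Here the columns are the main dictionary keys 4 and 5, the row index holds the sub-dictionary keys, and the values are the sub-dictionary values.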

The other option is:

df2 = pd.DataFrame.from_dict(dict1, orient='index')
print(df2)
      4     5     3    24   252   25     55
4  25.0  39.0  42.0   NaN   NaN  NaN    NaN
5   NaN   NaN   NaN  94.0  49.0  4.0  923.0
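Here the columns are the keys of the inner "sub-dictionaries", the row index holds the main dictionary keys, and the values are the sub-dictionary values.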

Is there a standard way to unfold a Python dictionary as follows?

key inner_key values
4        3      42 
4        4      25
4        5      39
5        24     94
5        25     4
5        55     923
5        252    49
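Ideally this would not involve manipulating the DataFrame after calling from_dict(), since that could become memory-intensive for larger Python dictionaries.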
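To make the memory concern concrete, here is a minimal sketch that compares how many cells the wide frame allocates with how many values the nested dictionary actually holds. The variable names (`wide`, `n_values`, `n_cells`, `n_padding`) are illustrative and not part of the question.

```python
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

# Wide frame: one row per distinct inner key, one column per outer key.
wide = pd.DataFrame.from_dict(dict1, orient='columns')

# Number of values actually present in the nested dictionary.
n_values = sum(len(inner) for inner in dict1.values())

# Cells allocated by the wide frame, and how many of them are NaN padding.
n_cells = wide.size
n_padding = int(wide.isna().sum().sum())

print(n_values, n_cells, n_padding)  # 7 14 7
```

In this small example half of the cells are padding. When the sub-dictionaries share few inner keys, the padded share grows with the number of outer keys, which is why a long (key, inner_key, values) layout is attractive.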

List comprehension

A list comprehension should be fairly efficient:

dict1 = {4: {4:25, 5:39, 3:42}, 5: {24:94, 252:49, 25:4, 55:923}}

cols = ['key', 'inner_key', 'values']

df = pd.DataFrame([[k1, k2, v2] for k1, v1 in dict1.items() for k2, v2 in v1.items()],
                  columns=cols).sort_values(cols)

print(df)

   key  inner_key  values
2    4          3      42
0    4          4      25
1    4          5      39
3    5         24      94
5    5         25       4
6    5         55     923
4    5        252      49
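The same idea can be wrapped in a small helper. This is only a sketch: the function name `unfold_nested` and its `cols` parameter are hypothetical, and it assumes every value in the outer dictionary is itself a dictionary. Because no NaN padding is ever created, the integer values keep an integer dtype, unlike in the `from_dict` frames above.

```python
import pandas as pd


def unfold_nested(nested, cols=('key', 'inner_key', 'values')):
    """Flatten {outer: {inner: value}} into one row per (outer, inner) pair."""
    rows = [
        (outer, inner, value)
        for outer, inner_map in nested.items()
        for inner, value in inner_map.items()
    ]
    return (
        pd.DataFrame(rows, columns=list(cols))
        .sort_values(list(cols))   # order by key, then inner_key
        .reset_index(drop=True)    # replace the shuffled index with 0..n-1
    )


dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

long_df = unfold_nested(dict1)
print(long_df)
print(long_df.dtypes)
```

Dropping the `reset_index(drop=True)` call would keep the original insertion positions in the index, as in the output shown above.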
pd.melt + dropna

If you don't mind working from df1, you can unpivot the DataFrame with pd.melt and then drop the rows whose value is null:

df1 = df1.reset_index()

res = pd.melt(df1, id_vars='index', value_vars=[4, 5])\
        .dropna(subset=['value']).astype(int)

print(res)

    index  variable  value
0       3         4     42
1       4         4     25
2       5         4     39
10     24         5     94
11     25         5      4
12     55         5    923
13    252         5     49
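For comparison with the layout requested in the question, the melt approach can also name and order the columns directly. The following is a self-contained sketch rather than the answerer's code. It assumes that the names `key`, `inner_key` and `values` are wanted, and it sorts explicitly so that the result does not depend on the row order `from_dict` happens to produce.

```python
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

df1 = pd.DataFrame.from_dict(dict1, orient='columns')

res = (
    df1.rename_axis('inner_key')      # name the index before exposing it
       .reset_index()                 # inner keys become a regular column
       .melt(id_vars='inner_key', var_name='key', value_name='values')
       .dropna(subset=['values'])     # discard the NaN padding
       .astype(int)                   # floats (caused by NaN) back to int
       [['key', 'inner_key', 'values']]
       .sort_values(['key', 'inner_key'])
       .reset_index(drop=True)
)

print(res)
```

Note that this still materialises the padded wide frame first, so it does not address the memory concern raised in the question.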
Output:

   key  inner_key   values
0   4   4           25
1   4   5           39
2   4   3           42
3   5   24          94
4   5   252         49
5   5   25          4
6   5   55         923
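The code that produced this output is not shown here. As an assumption only, one way to obtain a frame with these columns and rows is to build a Series keyed by (key, inner_key) tuples, which pandas turns into a MultiIndex, and then reset the index. The variable names below are illustrative.

```python
import pandas as pd

dict1 = {4: {4: 25, 5: 39, 3: 42}, 5: {24: 94, 252: 49, 25: 4, 55: 923}}

# Tuple keys become a two-level MultiIndex on the resulting Series.
pairs = {
    (outer, inner): value
    for outer, inner_map in dict1.items()
    for inner, value in inner_map.items()
}

flat = (
    pd.Series(pairs)
      .rename_axis(['key', 'inner_key'])  # name the two index levels
      .reset_index(name='values')         # levels become regular columns
)

print(flat)
```

Like the list comprehension, this never creates NaN padding, so the values stay integers.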

Thanks for the explanation! Much appreciated.