Python: Create a dictionary from an Excel file (DataFrame)

python pandas dictionary

I have an Excel file / DataFrame that looks like this:

+------+--------+
|  ID  | 2nd ID |
+------+--------+
| ID_1 |  R_1   |
| ID_1 |  R_2   |
| ID_2 |  R_3   |
| ID_3 |        |
| ID_4 |  R_4   |
| ID_5 |        |
+------+--------+

How can I convert it to a Python dictionary? I want my result to be:

{'ID_1':['R_1','R_2'],'ID_2':['R_3'],'ID_3':[],'ID_4':['R_4'],'ID_5':[]}

What should I do to get this?

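If you need to drop the missing values, do it inside the lambda function passed to the group-wise apply. One way is to use the fact that np.nan == np.nan returns False in a list comprehension to keep only the non-missing values (see also the warning in the documentation for more explanation):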

d = df.groupby('ID')['2nd ID'].apply(lambda x: [y for y in x if y == y]).to_dict()

If you need to remove empty strings instead:

d = df.groupby('ID')['2nd ID'].apply(lambda x: [y for y in x if y != '']).to_dict()
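To try this answer end to end, here is a minimal sketch that rebuilds the question's table by hand. It assumes the blank cells arrive as NaN, which is what pandas usually produces for empty Excel cells. The variable names d_dropna and d_compare are illustrative, and the Series.dropna variant is one possible way to drop missing values inside the lambda, not necessarily the one the answer had in mind.

```python
import numpy as np
import pandas as pd

# Sample table from the question, rebuilt by hand.
# Assumption: blank cells are NaN (not empty strings).
df = pd.DataFrame({
    "ID": ["ID_1", "ID_1", "ID_2", "ID_3", "ID_4", "ID_5"],
    "2nd ID": ["R_1", "R_2", "R_3", np.nan, "R_4", np.nan],
})

# Variant 1: drop missing values explicitly inside the lambda
d_dropna = (
    df.groupby("ID")["2nd ID"]
    .apply(lambda x: x.dropna().tolist())
    .to_dict()
)

# Variant 2: rely on NaN != NaN to filter inside a list comprehension
d_compare = (
    df.groupby("ID")["2nd ID"]
    .apply(lambda x: [y for y in x if y == y])
    .to_dict()
)

print(d_dropna)
```

Both variants keep IDs that have no second ID and map them to an empty list, because the grouping happens before the missing values are dropped.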

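Apply a function over the rows of your DataFrame that appends each value to a dict. apply does not work in place, so the dictionary is filled as a side effect of the call: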

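# One independent list per key; dict.fromkeys(keys, []) would make all keys share one list
d = {key: [] for key in df['ID'].unique()}

def func(x):
    # Skip missing values (NaN != NaN) so IDs without a 2nd ID keep an empty list
    if x['2nd ID'] == x['2nd ID']:
        d[x['ID']].append(x['2nd ID'])

# will return a series of Nones
df.apply(func, axis=1)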
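The same row-by-row idea can also be written as a plain loop, which avoids calling apply only for its side effect. The sketch below is an alternative under assumptions, not the answer's own code: it rebuilds the question's table by hand with NaN for the blank cells, and the names shared and bucket are illustrative. It also shows why dict.fromkeys(keys, []) is risky here: every key ends up pointing at the same list object.

```python
import numpy as np
import pandas as pd

# Sample table from the question; blank cells are assumed to be NaN
df = pd.DataFrame({
    "ID": ["ID_1", "ID_1", "ID_2", "ID_3", "ID_4", "ID_5"],
    "2nd ID": ["R_1", "R_2", "R_3", np.nan, "R_4", np.nan],
})

# Pitfall: fromkeys reuses one list object for every key
shared = dict.fromkeys(["a", "b"], [])
shared["a"].append(1)  # also visible through shared["b"]

# Build the mapping with one fresh list per ID
d = {}
for key, value in zip(df["ID"], df["2nd ID"]):
    bucket = d.setdefault(key, [])  # creates an empty list on first sight of the key
    if pd.notna(value):             # skip missing second IDs
        bucket.append(value)

print(d)
```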

Edit:

I asked this question on Gitter and @gurukiran07 gave me an answer. What you want to do is the opposite of the explode function:

s = pd.Series([[1, 2, 3], [4, 5]])

0    [1, 2, 3]
1       [4, 5]
dtype: object

exploded = s.explode()

0    1
0    2
0    3
1    4
1    5
dtype: object

exploded.groupby(level=0).agg(list)

0    [1, 2, 3]
1       [4, 5]
dtype: object
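Applied to the table in the question, the same group-and-collect idea could look like the sketch below. It assumes the blank cells are NaN and rebuilds the data by hand. Since agg(list) keeps NaN entries in the collected lists, they are filtered out in a second step. The names collected and d are illustrative.

```python
import numpy as np
import pandas as pd

# Sample table from the question; blank cells are assumed to be NaN
df = pd.DataFrame({
    "ID": ["ID_1", "ID_1", "ID_2", "ID_3", "ID_4", "ID_5"],
    "2nd ID": ["R_1", "R_2", "R_3", np.nan, "R_4", np.nan],
})

# Collect every value per ID into a list (NaN entries are still included here)
collected = df.groupby("ID")["2nd ID"].agg(list)

# Remove the NaN placeholders so IDs without a second ID map to an empty list
d = {key: [v for v in values if pd.notna(v)] for key, values in collected.items()}

print(d)
```

Dropping the NaN rows before grouping would be shorter, but it would also drop ID_3 and ID_5 from the result entirely, which is not what the question asks for.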

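Does this answer your question?

@sushanth No, because of the missing values.

What you want to do is the opposite of explode in pandas.

Please don't post only code as an answer. Please explain your answer/implementation.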
