Python: How to label groups of pairs in pandas?
I have this DataFrame:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [1, 2, 1, np.nan, 2, 2, 2], 'B': [2, 1, 2, 2.0, 1, 1, 2]})
>>> df
     A    B
0  1.0  2.0
1  2.0  1.0
2  1.0  2.0
3  NaN  2.0
4  2.0  1.0
5  2.0  1.0
6  2.0  2.0
I need to identify the groups of pairs (A, B) in a third column, "grup id", to get a result like this:
>>> df
     A    B  grup id  explanation
0  1.0  2.0      1.0  <- group (1.0, 2.0), first group
1  2.0  1.0      2.0  <- group (2.0, 1.0), second group
2  1.0  2.0      1.0  <- group (1.0, 2.0), first group
3  NaN  2.0      NaN  <- invalid group
4  2.0  1.0      2.0  <- group (2.0, 1.0), second group
5  2.0  1.0      2.0  <- group (2.0, 1.0), second group
6  2.0  2.0      3.0  <- group (2.0, 2.0), third group
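So the index of this groupby() lists all the groups I need. But how do I number them and map them back to my DataFrame?

You can use ngroup() (pandas 0.20.2+), then replace -1 with NaN and add 1: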
df['grup id'] = df.groupby(['A','B']).ngroup()
df['grup id'] = np.where(df['grup id'] == -1, np.nan, df['grup id'] + 1)
print(df)
     A    B  grup id
0  1.0  2.0      1.0
1  2.0  1.0      2.0
2  1.0  2.0      1.0
3  NaN  2.0      NaN
4  2.0  1.0      2.0
5  2.0  1.0      2.0
6  2.0  2.0      3.0
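
One detail worth spelling out: ngroup() numbers the groups in the order groupby() iterates over them, which by default is sorted key order, not order of first appearance. In the sample data the two orders happen to coincide. The sketch below wraps the snippet above in a helper so the difference can be tried out; the name label_pairs and its sort parameter are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd


def label_pairs(df, cols, sort=True):
    """Return 1-based group ids per row; NaN where any key column is missing.

    Hypothetical helper, not a pandas API. sort=True numbers groups by
    sorted key; sort=False numbers them by first appearance.
    """
    ids = df.groupby(cols, sort=sort).ngroup()
    # Rows with a missing key are marked -1 (or possibly NaN in newer
    # pandas releases); either way they come out as NaN here.
    return pd.Series(np.where(ids == -1, np.nan, ids + 1), index=df.index)


df = pd.DataFrame({'A': [1, 2, 1, np.nan, 2, 2, 2],
                   'B': [2, 1, 2, 2.0, 1, 1, 2]})
df['grup id'] = label_pairs(df, ['A', 'B'])
print(df)

# Data where sorted order and first-appearance order differ.
df2 = pd.DataFrame({'A': [2, 1, 2], 'B': [1, 2, 1]})
print(label_pairs(df2, ['A', 'B']).tolist())              # sorted key order
print(label_pairs(df2, ['A', 'B'], sort=False).tolist())  # first appearance
```

If "first group" is meant literally as the first pair seen in the data, sort=False is probably the variant to use.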
For older versions of pandas (below 0.20.2):
df['grup id'] = df.groupby(["A","B"]).grouper.group_info[0]
df['grup id'] = np.where(df['grup id'] == -1, np.nan, df['grup id'] + 1)
print(df)
     A    B  grup id
0  1.0  2.0      1.0
1  2.0  1.0      2.0
2  1.0  2.0      1.0
3  NaN  2.0      NaN
4  2.0  1.0      2.0
5  2.0  1.0      2.0
6  2.0  2.0      3.0
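
The grouper.group_info attribute used above is an internal of the groupby object and may not be available in every pandas release. As a rough alternative that relies only on public methods, the sketch below builds a small lookup table of the distinct valid pairs and merges it back onto the data. This is a swapped-in technique, not the approach from the answer; the names pairs and out are arbitrary, and it assumes the DataFrame has a default integer index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1, np.nan, 2, 2, 2],
                   'B': [2, 1, 2, 2.0, 1, 1, 2]})

# Distinct (A, B) pairs without missing values, in sorted order.
pairs = (df[['A', 'B']]
         .dropna()
         .drop_duplicates()
         .sort_values(['A', 'B'])
         .reset_index(drop=True))

# Number the pairs starting at 1; float so the column can hold NaN.
pairs['grup id'] = np.arange(1, len(pairs) + 1, dtype=float)

# A left merge keeps the row order of df; rows whose pair is not in the
# lookup table (here the row with NaN in A) get NaN as their id.
out = df.merge(pairs, on=['A', 'B'], how='left')
print(out)
```

Dropping the sort_values call should number the pairs in order of first appearance instead.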