Python: How to label groups of pairs in pandas?

I have this DataFrame:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A': [1, 2, 1, np.nan, 2, 2, 2], 'B': [2, 1, 2, 2.0, 1, 1, 2]})
>>> df
     A    B
0  1.0  2.0
1  2.0  1.0
2  1.0  2.0
3  NaN  2.0
4  2.0  1.0
5  2.0  1.0
6  2.0  2.0
I need to identify the groups of pairs (A, B) in a third column, "grup id", to get a result like this:

>>> df
     A    B  grup id                        explanation
0  1.0  2.0      1.0  <- group (1.0, 2.0), first group 
1  2.0  1.0      2.0  <- group (2.0, 1.0), second group
2  1.0  2.0      1.0  <- group (1.0, 2.0), first group 
3  NaN  2.0      NaN  <- invalid group                 
4  2.0  1.0      2.0  <- group (2.0, 1.0), second group
5  2.0  1.0      2.0  <- group (2.0, 1.0), second group
6  2.0  2.0      3.0  <- group (2.0, 2.0), third group 
So the index of this groupby() lists all the groups I need. But how do I number them and map them back onto my DataFrame?

You can use ngroup() (pandas 0.20.2+), with np.where to replace -1 with NaN and add 1:

df['grup id'] = df.groupby(['A','B']).ngroup()
df['grup id'] = np.where(df['grup id'] == -1, np.nan, df['grup id'] + 1)
print (df)
     A    B  grup id
0  1.0  2.0      1.0
1  2.0  1.0      2.0
2  1.0  2.0      1.0
3  NaN  2.0      NaN
4  2.0  1.0      2.0
5  2.0  1.0      2.0
6  2.0  2.0      3.0
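
The snippet above numbers the groups in sorted key order, which happens to match order of first appearance for this data. As a minimal sketch, the same idea can be wrapped in a helper that also accepts groupby's `sort` flag. The name `label_pairs` and the small `demo` frame are hypothetical and only for illustration. Rows with a missing key may come back from `ngroup()` as -1 or, I believe on newer pandas releases, as NaN; the sketch is written to tolerate both.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1, np.nan, 2, 2, 2],
                   'B': [2, 1, 2, 2.0, 1, 1, 2]})


def label_pairs(frame, cols, sort=True):
    """Return 1-based group numbers for `cols`, NaN where a key is missing.

    Hypothetical helper, not part of pandas.
    """
    ids = frame.groupby(cols, sort=sort).ngroup().astype(float)
    # Rows whose key contains NaN are either -1 or NaN at this point;
    # where() keeps only the valid (>= 0) ids and leaves NaN elsewhere.
    return ids.where(ids >= 0) + 1


df['grup id'] = label_pairs(df, ['A', 'B'])

# sort=True numbers groups by sorted key, sort=False by first appearance.
demo = pd.DataFrame({'A': [2, 1, 2], 'B': [1, 2, 1]})
sorted_ids = label_pairs(demo, ['A', 'B'])
appearance_ids = label_pairs(demo, ['A', 'B'], sort=False)
```

If "first group" is meant as "first pair seen", `sort=False` is probably the safer choice; with the default, a pair such as (1.0, 2.0) is always numbered before (2.0, 1.0) regardless of row order.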
For older versions of pandas (below 0.20.2):

df['grup id'] = df.groupby(["A","B"]).grouper.group_info[0]
df['grup id'] = np.where(df['grup id'] == -1, np.nan, df['grup id'] + 1)
print (df)
     A    B  grup id
0  1.0  2.0      1.0
1  2.0  1.0      2.0
2  1.0  2.0      1.0
3  NaN  2.0      NaN
4  2.0  1.0      2.0
5  2.0  1.0      2.0
6  2.0  2.0      3.0
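
If relying on groupby internals such as grouper.group_info is a concern, one possible alternative is to build a small lookup table of the distinct pairs and merge it back. This is a sketch under my own assumptions, not the method from the answer above: the names `pairs` and `labelled` are made up for illustration, and groups are numbered in order of first appearance rather than sorted order.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1, np.nan, 2, 2, 2],
                   'B': [2, 1, 2, 2.0, 1, 1, 2]})

# Distinct valid pairs, in order of first appearance; rows with NaN are dropped.
pairs = df[['A', 'B']].dropna().drop_duplicates().reset_index(drop=True)
pairs['grup id'] = np.arange(1, len(pairs) + 1)

# A left merge keeps every row of df in its original order; rows whose pair
# is not in the lookup table (here: the row with NaN in A) get NaN.
labelled = df.merge(pairs, on=['A', 'B'], how='left')
print(labelled)
```

Note that merge returns a frame with a fresh 0..n-1 index, which is the same as the original index in this example but would not be for a DataFrame with a custom index.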