Python: How to generate unique ids from a list of ids that contains duplicates

python python-2.7 pandas

I am using the pandas package to process my data, and I have a DataFrame that looks like this:

data = pd.read_csv('people.csv')
id, A, B
John, 1, 3
Mary, 2, 5
John, 4, 6
John, 3, 7
Mary, 5, 2

I want to generate unique ids for the duplicates while keeping the rows in their original order:

id, A, B
John, 1, 3
Mary, 2, 5
John.1, 4, 6
John.2, 3, 7   # John shows up three times.
Mary.1, 5, 2   # Mary shows up twice.

I tried approaches such as set_index, pd.factorize() and index_col, but none of them worked.

To get the indices, you can use:
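A minimal sketch of one way to build such an index, assuming `idx` is meant to be the occurrence number of each id within its group. It uses `GroupBy.cumcount`, which numbers the rows of each group 0, 1, 2, ... in their original order. The DataFrame is typed in from the sample in the question and stands in for people.csv.

```python
import pandas as pd

# Sample data copied from the question (stands in for people.csv).
df = pd.DataFrame({
    "id": ["John", "Mary", "John", "John", "Mary"],
    "A": [1, 2, 4, 3, 5],
    "B": [3, 5, 6, 7, 2],
})

# For each id, count how many times it has already appeared above this row:
# the first "John" gets 0, the second 1, the third 2.
idx = df.groupby("id").cumcount()
print(idx.tolist())  # [0, 0, 1, 2, 1]
```

Rows with a count of 0 are first occurrences and keep their id unchanged. Only the non-zero counts need a suffix.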

The non-zero values can then be appended as:

>>> mask = idx != 0
>>> df.loc[mask, 'id'] += '.' + idx[mask].astype('str')
>>> df
       id  A  B
0    John  1  3
1    Mary  2  5
2  John.1  4  6
3  John.2  3  7
4  Mary.1  5  2
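The two steps can also be wrapped in a small helper that returns a new column instead of modifying the DataFrame in place. This is only a sketch: the name `make_unique_ids` is hypothetical, and it assumes the ids are strings.

```python
import pandas as pd

def make_unique_ids(ids):
    """Return a copy of ids where repeats get a '.1', '.2', ... suffix."""
    # Occurrence number of each value within its group, in original order.
    idx = ids.groupby(ids).cumcount()
    # Candidate labels with the counter appended to every row.
    suffixed = ids + "." + idx.astype(str)
    # Keep first occurrences (idx == 0) unchanged; use the suffixed label elsewhere.
    return ids.where(idx == 0, suffixed)

ids = pd.Series(["John", "Mary", "John", "John", "Mary"])
print(make_unique_ids(ids).tolist())
# ['John', 'Mary', 'John.1', 'John.2', 'Mary.1']
```

Note that the result is only guaranteed to be unique if the data does not already contain a label such as "John.1".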