Pandas: replace a str in pandas if it is a duplicate
I am trying to create a way to search a column and replace a string if it is a duplicate value. This is what I have so far:
date
November 1st 2020
November 2nd 2020
November 3rd 2020
November 1st 2020
November 2nd 2020
November 3rd 2020
November 1st 2020
November 2nd 2020
November 3rd 2020
What I want is:
date
November 1st 2020 first instance
November 2nd 2020 first instance
November 3rd 2020 first instance
November 1st 2020 second instance
November 2nd 2020 second instance
November 3rd 2020 second instance
November 1st 2020 third instance
November 2nd 2020 third instance
November 3rd 2020 third instance
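Is there a way to do this?
This would only add "1 character" to the dataframe, so would you create a loop to keep adding 1, or whatever you want added? Something like: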
pd.Series(['date']).duplicated()
is_duplicate = df.apply(pd.Series.duplicated, axis=1)
for is_duplicate
df.where(~is_duplicate, +1)
I don't really understand how to iterate over the column to get the desired result.
IIUC, you can use groupby, cumcount and map to do this:
df['date'] += ' ' + df.groupby('date').cumcount().map({0:'first ', 1:'second ', 2: 'third '}) + 'instance'
Output:
date
0 November 1st 2020 first instance
1 November 2nd 2020 first instance
2 November 3rd 2020 first instance
3 November 1st 2020 second instance
4 November 2nd 2020 second instance
5 November 3rd 2020 second instance
6 November 1st 2020 third instance
7 November 2nd 2020 third instance
8 November 3rd 2020 third instance
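For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby / cumcount / map idea. The sample frame is rebuilt from the question. The `ordinals` dict and the "instance N" fallback are my own assumptions and not part of the original answer. They are there because map returns NaN for any count missing from the dict, for example a fourth repeat.

```python
import pandas as pd

# Sample data rebuilt from the question: three dates, each appearing three times.
dates = ["November 1st 2020", "November 2nd 2020", "November 3rd 2020"] * 3
df = pd.DataFrame({"date": dates})

# cumcount() numbers each row within its group of identical dates: 0, 1, 2, ...
occurrence = df.groupby("date").cumcount()

# Hypothetical label table (an assumption for illustration).
ordinals = {0: "first instance", 1: "second instance", 2: "third instance"}

# Counts not present in the dict map to NaN, so fill those with a
# generic "instance N" label instead of letting NaN leak into the column.
fallback = "instance " + (occurrence + 1).astype(str)
labels = occurrence.map(ordinals).fillna(fallback)

df["date"] = df["date"] + " " + labels
print(df)
```

With the nine rows from the question the fallback is never used, so the result should match the output above. It only matters once a date appears more than three times.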