Python: removing duplicate values that occur more than once in a DataFrame

Tags: python, python-2.7, python-3.x, pandas

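Please tell me how to get, in Python, the ImgFileName values whose hash code occurs more than once. Note: keep only the first occurrence and drop the rest, even if the repeated value appears in the middle, at the end, or anywhere else.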

I have a DataFrame that looks like this:

ImgFileName         HashCodes
Img_0001 - Copy.tif 162a47470f021a60
Img_0001.tif        162a47470f021a60
Img_0002.tif        1b5b5b1aa638dac8
Img_0003.tif        adadadadadadadad
Img_0004.tif        adadadadadadadad
Img_0005 - Copy.tif a5b8648c8c666670
Img_0005.tif        a5b8648c8c666670
Img_0006.tif        71b392da6a699392
Img_0007.tif        71b392da6a699392
Img_0008.tif        b1b1f2fa6bf97292
Img_0009.tif        86e82ae4c8b6c9c9
Img_0010 - Copy.tif 86e8aae4c8b6c9c9
Img_0010.tif        86e8aae4c8b6c9c9

I would like the output to look like this:

ImgFileName         HashCodes
Img_0001 - Copy.tif 162a47470f021a60
Img_0003.tif        adadadadadadadad
Img_0005 - Copy.tif a5b8648c8c666670
Img_0006.tif        71b392da6a699392
Img_0009.tif        86e82ae4c8b6c9c9
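
To make the question reproducible, the sample frame can be rebuilt as in the sketch below. The values are copied from the table in the question; the names counts and repeated_hashes are only illustrative. The value_counts call is one possible way to see which hash codes occur more than once before filtering anything.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ImgFileName": [
        "Img_0001 - Copy.tif", "Img_0001.tif", "Img_0002.tif",
        "Img_0003.tif", "Img_0004.tif", "Img_0005 - Copy.tif",
        "Img_0005.tif", "Img_0006.tif", "Img_0007.tif",
        "Img_0008.tif", "Img_0009.tif", "Img_0010 - Copy.tif",
        "Img_0010.tif",
    ],
    "HashCodes": [
        "162a47470f021a60", "162a47470f021a60", "1b5b5b1aa638dac8",
        "adadadadadadadad", "adadadadadadadad", "a5b8648c8c666670",
        "a5b8648c8c666670", "71b392da6a699392", "71b392da6a699392",
        "b1b1f2fa6bf97292", "86e82ae4c8b6c9c9", "86e8aae4c8b6c9c9",
        "86e8aae4c8b6c9c9",
    ],
})

# Count how often each hash occurs and keep only those seen more than once.
counts = df["HashCodes"].value_counts()
repeated_hashes = counts[counts > 1]
print(repeated_hashes)
```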
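You need boolean indexing with duplicated. First build a mask that selects every row whose hash code is duplicated (keep=False), then combine it with a second mask that decides which occurrences to return. With the default keep='first', the second mask selects every occurrence after the first:

# rows that repeat an earlier hash code (all occurrences except the first)
df1 = df[df.duplicated('HashCodes', keep=False) & df.duplicated('HashCodes')]
print(df1)
     ImgFileName         HashCodes
1   Img_0001.tif  162a47470f021a60
4   Img_0004.tif  adadadadadadadad
6   Img_0005.tif  a5b8648c8c666670
8   Img_0007.tif  71b392da6a699392
12  Img_0010.tif  86e8aae4c8b6c9c9

Or, with keep='last', it selects every occurrence except the last, which is the first occurrence when a hash code appears exactly twice:

# rows whose hash code appears again later (all occurrences except the last)
df2 = df[df.duplicated('HashCodes', keep=False) & df.duplicated('HashCodes', keep='last')]
print(df2)
            ImgFileName         HashCodes
0   Img_0001 - Copy.tif  162a47470f021a60
3          Img_0003.tif  adadadadadadadad
5   Img_0005 - Copy.tif  a5b8648c8c666670
7          Img_0006.tif  71b392da6a699392
11  Img_0010 - Copy.tif  86e8aae4c8b6c9c9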
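
One caveat worth sketching: in the sample data every repeated hash code occurs exactly twice, so "all but the last" and "only the first" give the same rows. If a hash code could occur three or more times, a variant that negates the default duplicated mask might be safer. The data and variable names below are hypothetical and only illustrate the difference.

```python
import pandas as pd

# Hypothetical data: hash "aa" occurs three times, "bb" once, "cc" twice.
df = pd.DataFrame({
    "ImgFileName": ["a1.tif", "a2.tif", "b1.tif", "a3.tif", "c1.tif", "c2.tif"],
    "HashCodes": ["aa", "aa", "bb", "aa", "cc", "cc"],
})

# True for every row whose hash code occurs more than once.
in_dupe_group = df.duplicated("HashCodes", keep=False)
# True for every occurrence after the first one.
not_first = df.duplicated("HashCodes")

# Only the first occurrence of each repeated hash, whatever the group size.
first_of_dupes = df[in_dupe_group & ~not_first]

# Equivalent formulation: filter the repeated groups, then drop later rows.
first_of_dupes_alt = df[in_dupe_group].drop_duplicates("HashCodes")

# The keep='last' mask returns every occurrence except the last one,
# so a hash that occurs three times contributes two rows here.
all_but_last = df[in_dupe_group & df.duplicated("HashCodes", keep="last")]

print(first_of_dupes)
print(all_but_last)
```

Both first_of_dupes and first_of_dupes_alt keep the original index labels, so the selected rows can still be traced back to the full frame.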