Python pandas Series: problem removing duplicates


I have a Series containing duplicates that I am trying to get rid of:

0     RWAY001
1     RWAY001
2     RWAY002
3     RWAY002
...
112    RWAY057
113    RWAY057
114    RWAY058
115    RWAY058
Length: 116

drop_duplicates() seems to reduce the length to 58, but the index still spans the original range (0 to 115), just skipping the duplicates:

0      RWAY001
2      RWAY002
...
112    RWAY057
114    RWAY058
Length: 58

So it seems the rows in between are still there, with NaN values. I tried dropna(), but it had no effect on the data.

This is my code:

  import pandas as pd

  df = pd.read_csv(path + flnm)  # path and flnm are defined elsewhere
  fields = df.file
  fields = fields.drop_duplicates()
  print(fields)

Any help is greatly appreciated. Thanks.

I think you need reset_index with the parameter drop=True:

fields.reset_index(inplace=True, drop=True)

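Or, assigning the result back instead of modifying in place:

fields = fields.reset_index(drop=True)

Sample: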

import pandas as pd

df = pd.DataFrame({'file': {0: 'RWAY001', 1: 'RWAY001', 2: 'RWAY002', 3: 'RWAY002', 112: 'RWAY057', 113: 'RWAY057', 114: 'RWAY058', 115: 'RWAY058'}})
print(df)
        file
0    RWAY001
1    RWAY001
2    RWAY002
3    RWAY002
112  RWAY057
113  RWAY057
114  RWAY058
115  RWAY058

print(df.file.drop_duplicates())
0      RWAY001
2      RWAY002
112    RWAY057
114    RWAY058
Name: file, dtype: object

print(df.file.drop_duplicates().reset_index(drop=True))
0    RWAY001
1    RWAY002
2    RWAY057
3    RWAY058
Name: file, dtype: object
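
To see why dropna() had no effect, it may help to check that the gaps in the index are only missing labels, not hidden NaN rows. The following is a minimal sketch, assuming a small hypothetical Series in place of the `file` column from the CSV. It also shows what reset_index returns when drop=True is left out.

```python
import pandas as pd

# Hypothetical stand-in for the `file` column read from the CSV.
fields = pd.Series(["RWAY001", "RWAY001", "RWAY002", "RWAY002"], name="file")

deduped = fields.drop_duplicates()

# The surviving rows keep their original labels; no NaN rows are created.
print(deduped.index.tolist())      # [0, 2]
print(int(deduped.isna().sum()))   # 0

# drop=True discards the old labels and renumbers the Series from 0.
renumbered = deduped.reset_index(drop=True)
print(renumbered.index.tolist())   # [0, 1]

# Without drop=True the old labels are kept as an extra "index" column,
# so the result is a DataFrame rather than a Series.
kept = deduped.reset_index()
print(list(kept.columns))          # ['index', 'file']
```

Since the deduplicated Series contains no missing values, dropna() has nothing to remove; only the labels need renumbering.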
