Python: removing duplicate values across different columns

python pandas

I have the following DataFrame:

   Feature name    error1    error2    error3    error4
0        1    A  overlaps  overlaps  overlaps  overlaps
1        2    B  No error
2        3    C  overlaps   invalid   invalid
3        4    D   invalid  overlaps  overlaps

I want each row to contain only unique errors, for example:

   Feature name    error1    error2    error3    error4
0        1    A  overlaps
1        2    B  No error
2        3    C  overlaps   invalid
3        4    D   invalid  overlaps

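Is there a simple way to do this? I thought I could count how many times each value occurs in each row, but I am not sure how to remove the duplicates afterwards.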

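The idea is to remove the duplicates from the error columns row by row, use DataFrame.reindex to add back any columns that were dropped in the process, and then assign the result back: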

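import numpy as np
import pandas as pd
# select the error columns by name
cols = df.filter(like='error').columns
# keep each row's unique values, then restore the original number of columns
df[cols] = (df[cols].apply(lambda x: pd.Series(x.unique()), axis=1)
                    .reindex(np.arange(len(cols)), axis=1))
print(df)
   Feature name    error1    error2  error3  error4
0        1    A  overlaps       NaN     NaN     NaN
1        2    B        No     error     NaN     NaN
2        3    C  overlaps   invalid     NaN     NaN
3        4    D   invalid  overlaps     NaN     NaN

Note that in this output "No error" appears as two separate values, "No" and "error", apparently because the sample data was split on whitespace when it was read in.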
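For a version that can be run end to end, here is a minimal sketch of the same row-wise idea. It assumes the sample frame is rebuilt by hand from the question, that the empty cells are NaN, and that missing values should be dropped before deduplicating; the helper name unique_row is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt by hand from the question; empty cells are assumed to be NaN.
df = pd.DataFrame({
    "Feature": [1, 2, 3, 4],
    "name": ["A", "B", "C", "D"],
    "error1": ["overlaps", "No error", "overlaps", "invalid"],
    "error2": ["overlaps", np.nan, "invalid", "overlaps"],
    "error3": ["overlaps", np.nan, "invalid", "overlaps"],
    "error4": ["overlaps", np.nan, np.nan, np.nan],
})

cols = df.filter(like="error").columns


def unique_row(row):
    # Hypothetical helper: keep each non-missing value once, in first-seen
    # order, then pad with NaN back to the original row length.
    vals = list(row.dropna().unique())
    return vals + [np.nan] * (len(row) - len(vals))


deduped = pd.DataFrame(
    [unique_row(row) for _, row in df[cols].iterrows()],
    index=df.index,
    columns=cols,
)
df[cols] = deduped
print(df)
```

Because the frame is built directly rather than parsed from text, "No error" stays a single value here. Padding each row explicitly keeps all four error columns in place without a separate reindex step.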

Try this:

import pandas as pd
# unique values of each row of the error columns, indexed by Feature
out = pd.DataFrame(list(map(pd.unique, df.loc[:,'error1':].values)),index=df.Feature)
print(out)
                0         1     2
Feature                          
1        overlaps      None  None
2              No     error  None
3        overlaps   invalid  None
4         invalid  overlaps  None

Both of these produce the same table I started with. I edited my question: there are more columns besides the error and Feature columns.
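When other columns sit next to the error columns, a positional slice such as df.loc[:, 'error1':] would sweep them in as well. The sketch below is one possible way to handle that, not a confirmed fix for the asker's data: it selects the error columns by name prefix and writes the deduplicated values back in place. The extra column "layer" and its values are invented purely for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question plus a hypothetical extra column "layer".
df = pd.DataFrame({
    "Feature": [1, 2, 3, 4],
    "name": ["A", "B", "C", "D"],
    "error1": ["overlaps", "No error", "overlaps", "invalid"],
    "error2": ["overlaps", np.nan, "invalid", "overlaps"],
    "error3": ["overlaps", np.nan, "invalid", "overlaps"],
    "error4": ["overlaps", np.nan, np.nan, np.nan],
    "layer": ["w", "x", "y", "z"],  # assumption: not part of the original data
})

# Pick the error columns by name so unrelated columns are left alone.
cols = [c for c in df.columns if c.startswith("error")]

block = df[cols].to_numpy(dtype=object)
result = np.full(block.shape, np.nan, dtype=object)
for i, row in enumerate(block):
    # Drop missing values, keep first occurrences, left-align in the row.
    uniq = pd.unique(row[~pd.isna(row)])
    result[i, :len(uniq)] = uniq

df[cols] = result
print(df)
```

Assigning a plain NumPy array back to df[cols] sidesteps any label alignment, so only the selected error columns change and "layer" is preserved as it was.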