Python: How to compare rows in a pandas DataFrame


python pandas


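I want to compare two rows that share the same ID (for example, rows 0 and 1) and drop the row with the smaller absolute Income. Is there a way to do this using only pandas functions, without looping over the rows with .itertuples()? I was thinking of using .shift and .apply, but I'm not sure how to go about it.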

 Index   ID             Income  
 0       2011000070      55019   
 1       2011000070          0   
 2       2011000074      23879   
 3       2011000074          0   
 4       2011000078          0   
 5       2011000078          0   
 6       2011000118     -32500   
 7       2011000118          0

Desired output:

 Index   ID             Income  
 0       2011000070      55019     
 2       2011000074      23879     
 4       2011000078          0     
 6       2011000118     -32500
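For anyone who wants to try the answers, the sample table can be rebuilt roughly like this. It is only a sketch: it assumes the frame is named df (as the answers do) and that Index is an ordinary column sitting next to the default integer index, which is what the printed outputs suggest.

```python
import pandas as pd

# Sample data copied from the question.
# Assumption: "Index" is a regular column, not the DataFrame index.
df = pd.DataFrame({
    "Index": [0, 1, 2, 3, 4, 5, 6, 7],
    "ID": [2011000070, 2011000070, 2011000074, 2011000074,
           2011000078, 2011000078, 2011000118, 2011000118],
    "Income": [55019, 0, 23879, 0, 0, 0, -32500, 0],
})

print(df)
```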

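Sorting by ID and by the absolute value of Income, then calling drop_duplicates, should solve your problem. Its keep parameter defaults to 'first', which is what you want: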

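# Helper column holding the absolute income, used only for sorting
df['Income_abs'] = df['Income'].abs()

# Largest absolute income first within each ID, keep the first row per ID, then drop the helper column
df.sort_values(['ID', 'Income_abs'], ascending=[True,False]).drop_duplicates(['ID']).drop('Income_abs',axis=1)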
Out[26]: 
   Index          ID  Income
0      0  2011000070   55019
2      2  2011000074   23879
4      4  2011000078       0
6      6  2011000118  -32500
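If you would rather not add a temporary column, the same sort-then-deduplicate idea can be sketched with the key argument of sort_values. This assumes pandas 1.1 or newer (where key was introduced); the sample frame and the name result are only for illustration. A stable sort is requested so that tied rows keep their original order.

```python
import pandas as pd

# Sample data from the question (assumption: "Index" is a regular column).
df = pd.DataFrame({
    "Index": [0, 1, 2, 3, 4, 5, 6, 7],
    "ID": [2011000070, 2011000070, 2011000074, 2011000074,
           2011000078, 2011000078, 2011000118, 2011000118],
    "Income": [55019, 0, 23879, 0, 0, 0, -32500, 0],
})

result = (
    # Sort by |Income|, largest first; mergesort is stable, so ties keep row order.
    df.sort_values("Income", key=lambda s: s.abs(), ascending=False, kind="mergesort")
      # keep="first" is the default: the first row seen for each ID survives.
      .drop_duplicates("ID")
      # Restore the original row order.
      .sort_index()
)

print(result)
```

The original df is left untouched here, so no cleanup of a helper column is needed.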

Alternative solution: use idxmax to get the index label of the largest absolute value within each ID, then select those rows with loc:

df = df.loc[df['Income'].abs().groupby(df['ID']).idxmax()]
print (df)
   Index          ID  Income
0      0  2011000070   55019
2      2  2011000074   23879
4      4  2011000078       0
6      6  2011000118  -32500
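To see what the one-liner is doing, here it is split into steps. This is just a sketch on the sample data; the names abs_income, winners and result are made up for the example, and it assumes the index labels are unique (otherwise loc could return extra rows).

```python
import pandas as pd

# Sample data from the question (assumption: "Index" is a regular column).
df = pd.DataFrame({
    "Index": [0, 1, 2, 3, 4, 5, 6, 7],
    "ID": [2011000070, 2011000070, 2011000074, 2011000074,
           2011000078, 2011000078, 2011000118, 2011000118],
    "Income": [55019, 0, 23879, 0, 0, 0, -32500, 0],
})

# Step 1: absolute income, same index as df.
abs_income = df["Income"].abs()

# Step 2: for each ID, the index label of the row with the largest absolute income.
# On ties idxmax returns the first occurrence.
winners = abs_income.groupby(df["ID"]).idxmax()

# Step 3: pick those rows out of the original frame by label.
result = df.loc[winners]

print(result)
```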

This might work:

In [458]: df.groupby('ID', as_index=False).apply(lambda x: x.loc[x.Income.abs().idxmax()])
Out[458]:
   Index          ID  Income
0      0  2011000070   55019
1      2  2011000074   23879
2      4  2011000078       0
3      6  2011000118  -32500
