Python: change a column's values based on a specific id in a DataFrame

python pandas dataframe

I have the sorted DataFrame below, and I want to set the last value of each id in the id column to 0.

    id  value
    1   500
    1   50
    1   36
    2   45
    2   150
    2   70
    2   20
    2   10

I can use df['value'].iloc[-1] = 0 to set the last value of the whole column to 0. How can I set the last value for id 1 and for id 2 to get the following output?

    id  value
    1   500
    1   50
    1   0
    2   45
    2   150
    2   70
    2   20
    2   0

You can use drop_duplicates with keep='last' to get the last row of each id. Take the index of those rows and set value to 0 there.

    df.loc[df['id'].drop_duplicates(keep='last').index, 'value'] = 0
    print(df)

       id  value
    0   1    500
    1   1     50
    2   1      0
    3   2     45
    4   2    150
    5   2     70
    6   2     20
    7   2      0
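To see why this works, it can help to look at the intermediate index on its own. The following is a minimal sketch that recreates the example data from the question; the variable name last_idx is only for illustration.

```python
import pandas as pd

# Recreate the example data from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2, 2],
    "value": [500, 50, 36, 45, 150, 70, 20, 10],
})

# Keep only the last occurrence of each id; the surviving rows
# keep their original index labels
last_idx = df["id"].drop_duplicates(keep="last").index
print(last_idx.tolist())  # [2, 7]

# Write 0 into 'value' at exactly those labels
df.loc[last_idx, "value"] = 0
print(df["value"].tolist())  # [500, 50, 0, 45, 150, 70, 20, 0]
```

Note that this relies on the index labels being unique. If the index contained repeated labels, .loc would match every row carrying one of those labels, not just the last row of each id.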
Broken down into steps:

    m = df.id.duplicated(keep='last')
    df.loc[~m, 'value'] = 0

       id  value
    0   1    500
    1   1     50
    2   1      0
    3   2     45
    4   2    150
    5   2     70
    6   2     20
    7   2      0

How it works

    # True for every row of an id except its last occurrence
    m = df.id.duplicated(keep='last')
    # ~m inverts the mask, so only the last row of each id is True
    # .loc then selects those rows in the 'value' column and writes 0
    df.loc[~m, 'value'] = 0
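The mask is easier to follow when printed. This is a small sketch on the question's example data, showing the mask before and after inversion.

```python
import pandas as pd

# Recreate the example data from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2, 2],
    "value": [500, 50, 36, 45, 150, 70, 20, 10],
})

# duplicated(keep="last") flags every row whose id appears again later
m = df["id"].duplicated(keep="last")
print(m.tolist())
# [True, True, False, True, True, True, True, False]

# Inverting leaves True only on the last row of each id
print((~m).tolist())
# [False, False, True, False, False, False, False, True]

df.loc[~m, "value"] = 0
print(df["value"].tolist())  # [500, 50, 0, 45, 150, 70, 20, 0]
```

Because the assignment uses a boolean mask rather than index labels, it does not depend on the index being unique.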

If you are happy to use numpy, here is a quick solution:

    import numpy as np
    import pandas as pd

    # Recreate example
    df = pd.DataFrame({
        "id": [1, 1, 1, 2, 2, 2, 2, 2],
        "value": [500, 50, 36, 45, 150, 70, 20, 10]
    })

    # Solution: write 0 where the row is the last of its id, else keep the value
    df["value"] = np.where(~df.id.duplicated(keep="last"), 0, df["value"].to_numpy())
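As a quick sanity check, the three approaches above can be run side by side on the example data. This is only a sketch: make_df is a hypothetical helper introduced here so that each approach starts from a fresh copy.

```python
import numpy as np
import pandas as pd


def make_df():
    # Hypothetical helper: returns a fresh copy of the question's example data
    return pd.DataFrame({
        "id": [1, 1, 1, 2, 2, 2, 2, 2],
        "value": [500, 50, 36, 45, 150, 70, 20, 10],
    })


# Approach 1: index of the last row per id via drop_duplicates
a = make_df()
a.loc[a["id"].drop_duplicates(keep="last").index, "value"] = 0

# Approach 2: boolean mask via duplicated
b = make_df()
b.loc[~b["id"].duplicated(keep="last"), "value"] = 0

# Approach 3: numpy.where on the same mask
c = make_df()
c["value"] = np.where(~c["id"].duplicated(keep="last"), 0, c["value"].to_numpy())

same = a["value"].tolist() == b["value"].tolist() == c["value"].tolist()
print(same)
```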

I would also like to set value to binary 1 if id 1 has value > 100 or id 2 has value < 100, and to binary 0 if id 1 has value < 101 or id 2 has value > 99. I have tried adding a new column, threshold. Is there a more efficient solution? How can I do this without adding a new column?

    rindx = df[((df['id'] == 1) & (df['value'] > 100)) | ((df['id'] == 2) & (df['value'] < 100))].index
    df.loc[rindx, 'threshold'] = 1
    rindx = df[((df['id'] == 1) & (df['value'] < 101)) | ((df['id'] == 2) & (df['value'] > 99))].index
    df.loc[rindx, 'threshold'] = 0

Expected output:

    id  value
    1   1
    1   0
    1   0
    2   1
    2   0
    2   1
    2   1
    2   0
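One possible way to do this without a helper column is to build the condition as a boolean mask and overwrite value directly. The sketch below rests on a few assumptions that are not stated in the thread: the values are integers (so "> 100" and "< 101" are exact complements, as are "< 100" and "> 99"), any id other than 1 or 2 would get 0, and the last row of each id stays 0, which is how the expected output above appears to read. The names flag and is_last are illustrative only.

```python
import numpy as np
import pandas as pd

# Recreate the example data from the question
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2, 2, 2],
    "value": [500, 50, 36, 45, 150, 70, 20, 10],
})

# Rows that should become 1: id 1 with value > 100, or id 2 with value < 100
flag = ((df["id"] == 1) & (df["value"] > 100)) | ((df["id"] == 2) & (df["value"] < 100))

# Last row of each id (assumption: these stay 0, as in the expected output)
is_last = ~df["id"].duplicated(keep="last")

# Overwrite 'value' in place: 1 where flagged and not a last row, else 0
df["value"] = np.where(flag & ~is_last, 1, 0)
print(df["value"].tolist())  # [1, 0, 0, 1, 0, 1, 1, 0]
```

If the last rows should follow the threshold rule instead of staying 0, dropping the is_last term from the np.where condition would be the natural change.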