Python groupby: change column values based on conditions in other columns
I want to group by the "group" column first, and then change the values in the "result" column based on conditions in the "result" and "rank" columns. This is what I have now:
import pandas as pd
import numpy as np
group = ['g1','g1','g1','g1','g1','g2','g2','g2','g2','g2','g2']
rank = ['1','2','3','4','5','1','2','3','4','5','6']
result = ['1','4','2','4','4','1','4','4','2','4','4']
df = pd.DataFrame({"group": group, "rank": rank, "result": result})
   group rank result
0     g1    1      1
1     g1    2      4
2     g1    3      2
3     g1    4      4
4     g1    5      4
5     g2    1      1
6     g2    2      4
7     g2    3      4
8     g2    4      2
9     g2    5      4
10    g2    6      4
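Within each group, I want to change result from 4 to 6 on the rows whose rank is greater than the rank of the row where result = 2. For example, in g1 the row with result = 2 has rank 3, so the results for ranks 4 and 5 become 6.
In g2 the row with result = 2 has rank 4, so the results for ranks 5 and 6 become 6.
In this case, my desired output is: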
   group rank result
0     g1    1      1
1     g1    2      4
2     g1    3      2
3     g1    4      6
4     g1    5      6
5     g2    1      1
6     g2    2      4
7     g2    3      4
8     g2    4      2
9     g2    5      6
10    g2    6      6
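I don't know the best way to achieve this. Can anyone help?
Thanks in advance.
Replace rank with NaN on the rows whose result does not match 2, then repeat the remaining value across each group, and finally compare for greater-than and set the value 6.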
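The function names in the steps above did not survive in this copy, so what follows is only a sketch of one way to carry them out. The specific calls (`Series.where`, `GroupBy.transform('first')`, `Series.gt`, `DataFrame.loc`) are an assumption, not the answerer's confirmed code, and the helper names `rank_num` and `rank_of_2` are made up for illustration. `rank` is converted to integers first so that the comparison is numeric.

```python
import pandas as pd

group = ['g1','g1','g1','g1','g1','g2','g2','g2','g2','g2','g2']
rank = ['1','2','3','4','5','1','2','3','4','5','6']
result = ['1','4','2','4','4','1','4','4','2','4','4']
df = pd.DataFrame({"group": group, "rank": rank, "result": result})

# Compare ranks as numbers, not as strings (rank_num is a hypothetical helper).
rank_num = df['rank'].astype(int)

# Keep the rank only where result == '2'; every other row becomes NaN.
rank_of_2 = rank_num.where(df['result'].eq('2'))

# Repeat each group's first non-NaN value on all rows of that group.
rank_of_2 = rank_of_2.groupby(df['group']).transform('first')

# Rows ranked after the result == '2' row get '6'.
# A group without any '2' has NaN here, so nothing in it is changed.
df.loc[rank_num.gt(rank_of_2), 'result'] = '6'
print(df)
```

If only rows whose current result is '4' should change, the mask can be narrowed with `& df['result'].eq('4')`.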
This will do it:
import pandas as pd
import numpy as np
group = ['g1','g1','g1','g1','g1','g2','g2','g2','g2','g2','g2']
rank = ['1','2','3','4','5','1','2','3','4','5','6']
result = ['1','4','2','4','4','1','4','4','2','4','4']
df = pd.DataFrame({"group": group, "rank": rank, "result": result})
def changeDf(x):
    # Rows that belong to the same group as the current row
    df_gp = df[df['group'] == x['group']]
    # Rank of the row in that group whose result is '2'
    rank_of_2 = df_gp.loc[df_gp['result'] =='2', 'rank'].values[0]
    # Compare as integers, because rank is stored as strings
    if int(x['rank']) > int(rank_of_2):
        return '6'
    else:
        return x['result']
df['result'] = df.apply(changeDf, axis=1)
print(df)
This fails if nothing matches '2', and it cannot compare strings such as 10 correctly, because for example '5' > '10'.
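The two pitfalls named in this comment can be reproduced in isolation. This is a minimal sketch, assuming a made-up two-row group (`df_gp`) that contains no row with result '2'; it is not taken from the thread.

```python
import pandas as pd

# Strings compare character by character, so '5' sorts after '10'.
string_cmp = '5' > '10'
int_cmp = int('5') > int('10')
print(string_cmp)  # True
print(int_cmp)     # False

# Hypothetical group without any result == '2': the selection is empty,
# so indexing it with [0] raises IndexError.
df_gp = pd.DataFrame({"rank": ['1', '2'], "result": ['1', '4']})
matches = df_gp.loc[df_gp['result'] == '2', 'rank'].values
try:
    matches[0]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```

Converting with `int()` before comparing, as the function above does, avoids the first problem. The second one needs an explicit check for an empty selection.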