Python: return the column name of the second-highest value in each row
import pandas as pd

data = [['john', 0.20, 0.0, 0.4, 0.40],
        ['katty', 0.0, 1.0, 0.0, 0.0],
        ['kent', 0.0, 0.51, 0.49, 0.0]]
df = pd.DataFrame(data, columns=['name', 'fruit', 'vegetable', 'softdrinks', 'icecream'])
df = df.set_index('name')
df.head()
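I tried idxmax, but it only returns the column name of the highest value. I need the column name of the second-highest value in each row. How can I achieve this?
Thanks a lot.

First set 0 to missing values with DataFrame.mask, then reshape with DataFrame.stack, take the top 2 values per row with SeriesGroupBy.nlargest, and finally reshape the data with DataFrame.pivot, using GroupBy.cumcount to number the results.
Alternatively, use the solution from the comments: take idxmax, set the maximum to missing by comparing the column names with broadcasting in numpy, and take idxmax again. The reshape approach comes first; the idxmax variant follows it.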
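# Treat 0 as missing, reshape to long form, and keep the two largest values per row.
# dropna() is explicit because newer pandas versions may keep NaN entries in stack().
df1 = df.mask(df == 0).stack().dropna().groupby(level=0, group_keys=False).nlargest(2).reset_index()
# Number the kept values within each name: 1 = highest, 2 = second highest.
df1 = df1.assign(a=df1.groupby('name').cumcount().add(1))
# pivot accepts keyword arguments only in pandas 2.0 and later.
df = df.join(df1.pivot(index='name', columns='a', values='level_1').add_prefix('max_no'))
print(df)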
       fruit  vegetable  softdrinks  icecream     max_no1     max_no2
name
john     0.2       0.00        0.40       0.4  softdrinks    icecream
katty    0.0       1.00        0.00       0.0   vegetable         NaN
kent     0.0       0.51        0.49       0.0   vegetable  softdrinks
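
The same pipeline can be wrapped in a small function so that the number of ranked columns is a parameter. This is only a sketch: top_n_columns, its n and prefix arguments, and the use of unstack instead of pivot are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

def top_n_columns(frame, n=2, prefix='max_no'):
    """Hypothetical helper: names of the n largest non-zero columns per row."""
    # Treat zeros as missing and reshape to a long Series indexed by (row, column).
    stacked = frame.mask(frame == 0).stack().dropna()
    # Keep the n largest values per row; ties keep the original column order.
    top = stacked.groupby(level=0, group_keys=False).nlargest(n)
    # Rank the kept values within each row: 1 = highest, 2 = second highest, ...
    rank = top.groupby(level=0).cumcount().add(1).to_numpy()
    rows = top.index.get_level_values(0)
    names = top.index.get_level_values(1).to_numpy()
    # Put the column names in a Series keyed by (row, rank) and spread ranks into columns.
    wide = pd.Series(names, index=pd.MultiIndex.from_arrays([rows, rank])).unstack()
    # Restore the original row order and make sure all n rank columns exist.
    wide = wide.reindex(index=frame.index, columns=range(1, n + 1))
    return wide.add_prefix(prefix)

data = [['john', 0.20, 0.0, 0.4, 0.40],
        ['katty', 0.0, 1.0, 0.0, 0.0],
        ['kent', 0.0, 0.51, 0.49, 0.0]]
scores = pd.DataFrame(data, columns=['name', 'fruit', 'vegetable', 'softdrinks', 'icecream'])
scores = scores.set_index('name')

ranked = top_n_columns(scores, n=2)
print(scores.join(ranked))
```

Rows with fewer than n non-zero values, such as katty, get NaN in the remaining rank columns.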
Remove the highest value, then do it again?
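# Start from the original df, before max_no1/max_no2 were joined.
df1 = df.mask(df == 0)
df['max_no1'] = df1.idxmax(axis=1)
# Boolean mask marking, for each row, the column that holds the maximum.
m = df1.columns.to_numpy() == df['max_no1'].to_numpy()[:, None]
#pandas below 0.24
#m = df1.columns.values == df['max_no1'].values[:, None]
# Hide the maximum so that the next idxmax returns the second-highest column.
# katty's row is then all NaN; recent pandas versions may warn about all-NaN rows.
df1 = df1.mask(m)
df['max_no2'] = df1.idxmax(axis=1)
print(df)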
       fruit  vegetable  softdrinks  icecream     max_no1     max_no2
name
john     0.2       0.00        0.40       0.4  softdrinks    icecream
katty    0.0       1.00        0.00       0.0   vegetable         NaN
kent     0.0       0.51        0.49       0.0   vegetable  softdrinks
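
If only the second-highest column is needed, sorting each row once in NumPy is another option. The following is a minimal sketch rather than part of the original answers: second_highest_column is a hypothetical helper, and it assumes that zeros count as missing and that the data contains no genuine negative-infinity values.

```python
import numpy as np
import pandas as pd

def second_highest_column(frame):
    """Hypothetical helper: column name of the second-largest non-zero value per row."""
    # Treat zeros as missing, then replace missing values with -inf so they sort last.
    values = frame.mask(frame == 0).to_numpy(dtype=float)
    filled = np.where(np.isnan(values), -np.inf, values)
    # Stable sort on the negated values: descending order, ties keep column order.
    order = np.argsort(-filled, axis=1, kind='stable')
    second_pos = order[:, 1]
    names = frame.columns.to_numpy()[second_pos].astype(object)
    # Rows with fewer than two non-zero values have no second-highest column.
    second_val = np.take_along_axis(filled, second_pos[:, None], axis=1)[:, 0]
    names[np.isneginf(second_val)] = np.nan
    return pd.Series(names, index=frame.index, name='max_no2')

data = [['john', 0.20, 0.0, 0.4, 0.40],
        ['katty', 0.0, 1.0, 0.0, 0.0],
        ['kent', 0.0, 0.51, 0.49, 0.0]]
scores = pd.DataFrame(data, columns=['name', 'fruit', 'vegetable', 'softdrinks', 'icecream'])
scores = scores.set_index('name')

second = second_highest_column(scores)
print(second)
```

The stable sort is what makes john's tie between softdrinks and icecream resolve the same way as in the outputs above: the earlier column counts as the highest and the later one as the second highest.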