Python: in each row of a DataFrame, get the top-n values and the names of their columns


python pandas dataframe


I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({'a':[1,2,1],'b':[4,6,0],'c':[0,4,8]})
+---+---+---+
| a | b | c |
+---+---+---+
| 1 | 4 | 0 |
+---+---+---+
| 2 | 6 | 4 |
+---+---+---+
| 1 | 0 | 8 |
+---+---+---+

For each row, I need both the n highest values (two in this case) and the corresponding columns, in descending order:

row 1: 'b':4,'a':1
row 2: 'b':6,'c':4
row 3: 'c':8,'a':1

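Here are two approaches, both adapted from an earlier answer:

1) Use Python decorate-sort-undecorate with .apply(lambda ...) on each row to insert the column names, do the np.argsort, keep the top-n, and reformat the answer. (I think this is cleaner.)

2) Get the topnlocs matrix as shown below, then use it to index into both df.columns and df.values, and combine that output: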

import numpy as np

nlargest = 2
topnlocs = np.argsort(-df.values, axis=1)[:, 0:nlargest]
# ... now you can use topnlocs to reindex both into df.columns, and df.values, then reformat/combine them somehow
# however it's painful trying to apply that NumPy array of indices back to df or df.values,
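One possible way to finish approach 2 is sketched below. It is not the answerer's own code: it assumes numeric columns (so that negating the values is safe), uses np.take_along_axis to pull the values at the topnlocs positions, and introduces the names top_cols, top_vals and result purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1], 'b': [4, 6, 0], 'c': [0, 4, 8]})

nlargest = 2
values = df.to_numpy()

# Column positions of the n largest values in each row, largest first.
# kind='stable' keeps the original column order among tied values.
topnlocs = np.argsort(-values, axis=1, kind='stable')[:, :nlargest]

# Use the same positions to index the column labels and the values.
top_cols = df.columns.to_numpy()[topnlocs]               # illustrative name
top_vals = np.take_along_axis(values, topnlocs, axis=1)  # illustrative name

# One dict per row; dicts keep insertion order, so keys stay in descending order of value.
result = [dict(zip(cols, vals.tolist())) for cols, vals in zip(top_cols, top_vals)]
print(result)
```

Because the column labels and the values are both indexed through plain NumPy arrays, this avoids having to push the index matrix back through the DataFrame indexers.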


Are there guaranteed to be three columns named a, b, c, or do you want a general answer?

I want a general answer; I only used three columns for simplicity.

With nlargest=2 this is a perfect duplicate. There's your answer. (I can't redirect my vote now.)

OK, now I see that you wrote "for each row I need (edit: both) the top-n values and the corresponding columns in descending order". Yes, sorry, that's a bit different.

So I went the extra five miles and gave you working code. It's pretty painful. Option 1) sounds less painful and works better.

This is exactly what I wanted, except that 1) ix is deprecated, and 2) changing that line to use iloc raises "Too many indexers". I'm using pandas 0.25.3.
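For readers who hit the same ix / iloc problem, here is a rough sketch of option 1 that uses neither indexer. It swaps the np.argsort step for Series.nlargest, so it is a variation on the idea rather than the answerer's original code; the helper name top_n is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 1], 'b': [4, 6, 0], 'c': [0, 4, 8]})

def top_n(row, n=2):
    """Hypothetical helper: (column, value) pairs for the n largest values in a row."""
    top = row.nlargest(n)  # already sorted in descending order
    return list(zip(top.index, top.tolist()))

# axis=1 passes each row to top_n as a Series indexed by column name.
result = df.apply(top_n, axis=1, n=2)
print(result.tolist())
```

The result is a Series holding one list of (column, value) pairs per row. Per-row apply is usually slower than the NumPy approach on large frames, but it keeps the column names attached to the values throughout.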