Python: How to sort each row of a DataFrame and return the column index based on the sorted row values

python pandas sorting


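I am trying to sort each row of a pandas DataFrame and get the column labels, in sorted order, in a new DataFrame. I have a way to do this, but it is slow. Can anyone suggest an improvement using parallelized or vectorized code? I have posted an example below.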

Data: url=""

The result I want is

  tag_0 tag_1      tag_2    tag_3
0   pop  year  gdpPercap  lifeExp
1   pop  year  gdpPercap  lifeExp
2   pop  year  gdpPercap  lifeExp

In this case, since pop is always higher than gdpPercap and lifeExp, it always comes first.

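I can get the desired output with the code below. However, if df has many rows or columns, the computation takes much longer.

Can anyone suggest an improvement?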

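import pandas as pd

def sort_df(df):
    # One output column per rank: tag_0 holds the label of the largest value in the row.
    sorted_tags = pd.DataFrame(index=df.index, columns=['tag_{}'.format(i) for i in range(df.shape[1])])
    for i in range(df.shape[0]):
        # Sort row i in descending order and store the resulting column labels.
        sorted_tags.iloc[i, :] = list(df.iloc[i, :].sort_values(ascending=False).index)
    return sorted_tags

sort_df(gapminder)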

This is probably about as fast as it gets with numpy:

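import numpy as np
import pandas as pd

def sort_df(df):
    return pd.DataFrame(
        # argsort on the negated values gives, per row, the column positions in descending order.
        data=df.columns.values[np.argsort(-df.values, axis=1)],
        columns=['tag_{}'.format(i) for i in range(df.shape[1])]
    )

print(sort_df(gapminder.head(3)))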

  tag_0 tag_1      tag_2    tag_3
0   pop  year  gdpPercap  lifeExp
1   pop  year  gdpPercap  lifeExp
2   pop  year  gdpPercap  lifeExp

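Explanation: np.argsort sorts the values along each row, but instead of returning the sorted values it returns the indices that would sort the array. Those indices can be used to co-sort other arrays. Negating the values gives a descending order. In this case, you use the indices to reorder the column labels. numpy takes care of returning the correct shape: indexing the 1D array of column labels with the 2D index array yields a 2D result.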
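To make the steps above concrete, here is a minimal sketch on a made-up 2x3 frame. The frame `toy` and its column names are hypothetical, chosen only so the result can be checked by hand; `to_numpy()` is used in place of `.values` simply to guarantee plain numpy arrays.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the gapminder set from the question.
toy = pd.DataFrame({"a": [1, 9], "b": [5, 2], "c": [3, 7]})

# Per row, the column positions that put the values in descending order.
# Row 0 is [1, 5, 3], so the order is b (position 1), c (2), a (0).
order = np.argsort(-toy.to_numpy(), axis=1)

# Index the 1D array of column labels with the 2D array of positions.
ranked = toy.columns.to_numpy()[order]

print(order)
print(ranked)
```

Note that negating works for signed numeric data like this; unsigned integer columns would need a different way to reverse the order, for example reversing an ascending argsort.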

For your example, this runs in about 3 ms, compared with about 2.5 s for your function.
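If you want to check the difference on your own machine, a rough sketch like the one below may help. It uses synthetic random data rather than gapminder, so the absolute numbers will not match the figures quoted above. The names `sort_df_loop` and `sort_df_vectorized` are mine, and passing `index=df.index` in the vectorized version is an addition so that both results carry the same row index.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: 200 rows, 4 numeric columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((200, 4)), columns=["w", "x", "y", "z"])

def sort_df_loop(df):
    # Row-by-row version from the question.
    tags = ['tag_{}'.format(i) for i in range(df.shape[1])]
    sorted_tags = pd.DataFrame(index=df.index, columns=tags)
    for i in range(df.shape[0]):
        sorted_tags.iloc[i, :] = list(df.iloc[i, :].sort_values(ascending=False).index)
    return sorted_tags

def sort_df_vectorized(df):
    # argsort-based version from the answer, with the row index preserved.
    tags = ['tag_{}'.format(i) for i in range(df.shape[1])]
    order = np.argsort(-df.to_numpy(), axis=1)
    return pd.DataFrame(df.columns.to_numpy()[order], index=df.index, columns=tags)

loop_result = sort_df_loop(df)
vec_result = sort_df_vectorized(df)

# Both approaches should produce the same labels when the rows contain no ties.
same = np.array_equal(loop_result.to_numpy(), vec_result.to_numpy())

loop_time = timeit.timeit(lambda: sort_df_loop(df), number=3)
vec_time = timeit.timeit(lambda: sort_df_vectorized(df), number=3)
print("same result:", same)
print("loop: {:.4f}s, vectorized: {:.4f}s".format(loop_time, vec_time))
```

With tied values in a row the two versions may order the tied columns differently, so the equality check is only meaningful for data without ties.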

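Thanks a lot. argsort, together with broadcasting, cuts the time considerably. I did not know that a single array (df.columns) can be repeated multiple times when it is indexed with the 2D index from np.argsort(-df.values, axis=1).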
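The behaviour this comment describes can be seen in isolation with plain numpy. In the sketch below the label array mirrors the column names from the example output, while the position array `idx` is made up for illustration; the point is only that the result takes the shape of the index array.

```python
import numpy as np

labels = np.array(["pop", "year", "gdpPercap", "lifeExp"])

# Hypothetical 2D array of positions, one row of positions per output row.
idx = np.array([[0, 1, 2, 3],
                [3, 2, 1, 0]])

# Indexing a 1D array with a 2D integer array picks labels[idx[i, j]]
# for every (i, j), so the result has the same shape as idx.
picked = labels[idx]

print(picked.shape)
print(picked)
```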
