Python: how to sort values in a MultiIndex while preserving the index structure

python pandas

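I want to sort the data in a MultiIndex DataFrame while keeping the higher-level index intact.

Here is a sample of the data:

import pandas as pd

data = {
    "Column 1": [1, 2, 3, 4, 34, 2, 5, 6],
    "Index1 Title": [
        "Apples", "Apples", "Puppies", "Puppies",
        "Oranges", "Oranges", "Blue berries", "Blue berries"],
    "index2 Title": [
        "Inside", "Outside", "Inside", "Outside",
        "Inside", "Outside", "Inside", "Outside"
    ]
}
df = pd.DataFrame(data)
# Move both title columns into a two-level MultiIndex
df.set_index(['Index1 Title', 'index2 Title'], inplace=True)

I get this output: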

Index1 Title index2 Title  Column 1
Apples       Inside         1.0
             Outside        2.0
Puppies      Inside         3.0
             Outside        4.0
Oranges      Inside         34.0
             Outside        2.0
Blue berries Inside         5.0
             Outside        6.0

When I try this code:

df.sort_values('Column 1', ascending=False)

I get:

Index1 Title    index2 Title    Column 1
Oranges         Inside          34.0
Blue berries    Outside         6.0
                Inside          5.0
Puppies         Outside         4.0
                Inside          3.0
Apples          Outside         2.0
Oranges         Outside         2.0
Apples          Inside          1.0

What I want is similar to what a pivot table in Excel would give me, i.e.:

Index1 Title    index2 Title    Column 1
Oranges         Inside          34
                Outside         2
Blue berries    Inside          5
                Outside         6
Puppies         Inside          3
                Outside         4
Apples          Inside          1
                Outside         2

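Essentially, sum the values for each Index1 Title, sort the groups by that sum, and keep the index2 Title structure intact within each group.

I have been struggling with this for a while and can't find a fix using the standard pandas MultiIndex tools. Even the reset_index and sort_index routines don't do it.

Could a for loop solve this?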

sort_values accepts index level names as sort keys, so you can do:

df.sort_values(['Index1 Title', 'Column 1'], ascending=[True, False])

Output:

                           Column 1
Index1 Title index2 Title          
Apples       Outside            2.0
             Inside             1.0
Blue berries Outside            6.0
             Inside             5.0
Oranges      Inside            34.0
             Outside            2.0
Puppies      Outside            4.0
             Inside             3.0

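Update: actually, I think I now understand what you want. There is no direct way to do it. You need to create a new series, sort it, and reindex the data with the result: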

sorted_sum = (df.groupby(level=0).transform('sum')
                .sort_values(['Column 1', 'index2 Title'], 
                              ascending=[False,True], 
                              kind='mergesort')
             )

df.loc[sorted_sum.index]

Output:

                           Column 1
Index1 Title index2 Title          
Oranges      Inside            34.0
             Outside            2.0
Blue berries Inside             5.0
             Outside            6.0
Puppies      Inside             3.0
             Outside            4.0
Apples       Inside             1.0
             Outside            2.0
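
The steps described above (compute a per-group total, sort by it, then reorder the original rows) can be sketched end to end as follows. This is a minimal sketch, not the answer's exact code: the helper column name `_total` is hypothetical, and using Index1 Title as a tie-breaker is an added assumption so that two groups with equal totals would not interleave.

```python
import pandas as pd

data = {
    "Column 1": [1, 2, 3, 4, 34, 2, 5, 6],
    "Index1 Title": ["Apples", "Apples", "Puppies", "Puppies",
                     "Oranges", "Oranges", "Blue berries", "Blue berries"],
    "index2 Title": ["Inside", "Outside"] * 4,
}
df = pd.DataFrame(data).set_index(["Index1 Title", "index2 Title"])

# Broadcast each group's sum back onto every row of that group.
group_total = df.groupby(level=0)["Column 1"].transform("sum")

# Sort by group total (descending), then by the two index levels so that
# rows of one group stay together and keep their Inside/Outside order.
# "_total" is a hypothetical helper column, dropped again afterwards.
result = (
    df.assign(_total=group_total)
      .sort_values(["_total", "Index1 Title", "index2 Title"],
                   ascending=[False, True, True])
      .drop(columns="_total")
)
print(result)
```

Because both index levels are used as sort keys, the order of the second level no longer depends on the stability of the sort algorithm.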

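Note: I can't work out why the second index level came out reversed, even with mergesort.

You can also use .reindex with the level parameter to reorder one or more levels of a MultiIndex without changing the order of the others: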
# Series.sum(level=...) was removed in pandas 2.0; group on the level instead
sorted_sums = df["Column 1"].groupby(level=0).sum().sort_values(ascending=False)
out = df.reindex(sorted_sums.index, level=0)

print(out)
                           Column 1
Index1 Title index2 Title
Oranges      Inside            34.0
             Outside            2.0
Blue berries Inside             5.0
             Outside            6.0
Puppies      Inside             3.0
             Outside            4.0
Apples       Inside             1.0
             Outside            2.0

Also, if memory serves, there was a bug in the level parameter of reindex. I ran this on pandas 1.1.1, so I can't guarantee it works on earlier versions.
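
On the question of whether a for loop could do this: it can, although the vectorised approaches are usually preferable. The following is a minimal sketch that is not taken from either answer; the names `totals`, `pieces`, and `looped` are hypothetical. It walks the groups in descending order of their sums and concatenates the untouched slices.

```python
import pandas as pd

data = {
    "Column 1": [1, 2, 3, 4, 34, 2, 5, 6],
    "Index1 Title": ["Apples", "Apples", "Puppies", "Puppies",
                     "Oranges", "Oranges", "Blue berries", "Blue berries"],
    "index2 Title": ["Inside", "Outside"] * 4,
}
df = pd.DataFrame(data).set_index(["Index1 Title", "index2 Title"])

# One total per first-level label, largest first.
totals = df.groupby(level=0)["Column 1"].sum().sort_values(ascending=False)

# Collect each group's rows in that order; selecting with a list
# (df.loc[[name]]) keeps both index levels on the slice.
pieces = []
for name in totals.index:
    pieces.append(df.loc[[name]])

looped = pd.concat(pieces)
print(looped)
```

Each slice is copied as-is, so the rows inside a group keep whatever order they had in the original frame.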
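
Nice use of transform! Great idea! For my purposes this is perfect. The reversed order is a minor thing. Might be a bug.

Thanks a lot @QuangHoang. I edited the code so that it returns the expected output. Feel free to revert if you disagree. Since this is a DataFrame, you can sort on a column and an index level at the same time, which I think is why it works.

So I'll keep @Quang-Hoang's answer. Thanks a lot for your solution. Nice one!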