Sorting a pandas MultiIndex Series by a subset of its levels (Python 3.x)
I have a Series with a three-level MultiIndex:
print(ser_test)
Value
Date Group Country
2014-01-31 3 AE example
AR example
2014-02-28 3 AE example
AR example
2014-03-31 3 AE example
AR example
2014-04-30 3 AE example
AR example
2014-05-30 3 AR example
2014-06-30 2 AE example
3 AR example
2014-07-31 2 AE example
3 AR example
2014-08-29 2 AE example
3 AR example
2014-09-30 2 AE example
3 AR example
2014-10-31 2 AE example
3 AR example
2014-11-28 2 AE example
3 AR example
2014-12-31 2 AE example
3 AR example
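For anyone who wants to reproduce this, the series above can be rebuilt roughly as follows. This is only a sketch: the string "example" is the placeholder shown in the printout rather than real data, and the variable names other than ser_test are made up for illustration.

```python
import pandas as pd

# The twelve dates from the printout above.
dates = pd.to_datetime([
    "2014-01-31", "2014-02-28", "2014-03-31", "2014-04-30",
    "2014-05-30", "2014-06-30", "2014-07-31", "2014-08-29",
    "2014-09-30", "2014-10-31", "2014-11-28", "2014-12-31",
])

tuples = []
for d in dates:
    if d.month <= 4:
        tuples.append((d, 3, "AE"))   # AE is in group 3 from January to April
    elif d.month >= 6:
        tuples.append((d, 2, "AE"))   # AE is in group 2 from June onwards
    # AE has no row for May; AR is in group 3 on every date.
    tuples.append((d, 3, "AR"))

index = pd.MultiIndex.from_tuples(tuples, names=["Date", "Group", "Country"])

# "example" is a placeholder value, as in the printed output.
ser_test = pd.Series("example", index=index, name="Value")
print(ser_test)
```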
My goal is to sort the Series by Country first and then by Date, ignoring the Group level, to get the following result:
Value
Date Group Country
2014-01-31 3 AE example
2014-02-28 3 AE example
2014-03-31 3 AE example
2014-04-30 3 AE example
2014-06-30 2 AE example
2014-07-31 2 AE example
2014-08-29 2 AE example
2014-09-30 2 AE example
2014-10-31 2 AE example
2014-11-28 2 AE example
2014-12-31 2 AE example
2014-01-31 3 AR example
2014-02-28 3 AR example
2014-03-31 3 AR example
2014-04-30 3 AR example
2014-05-30 3 AR example
2014-06-30 3 AR example
2014-07-31 3 AR example
2014-08-29 3 AR example
2014-09-30 3 AR example
2014-10-31 3 AR example
2014-11-28 3 AR example
2014-12-31 3 AR example
I still need the Group level later on, so I cannot simply drop it.
So I tried the sort_index method like this:
print(ser_test.sort_index(level = ['Country', 'Date']))
or like this:
print(ser_test.sort_index(level = ['Country', 'Date'], sort_remaining = False))
In both cases the Group level took part in the sort and took priority over the Date level:
Value
Date Group Country
2014-06-30 2 AE example
2014-07-31 2 AE example
2014-08-29 2 AE example
2014-09-30 2 AE example
2014-10-31 2 AE example
2014-11-28 2 AE example
2014-12-31 2 AE example
2014-01-31 3 AE example
2014-02-28 3 AE example
2014-03-31 3 AE example
2014-04-30 3 AE example
2014-01-31 3 AR example
2014-02-28 3 AR example
2014-03-31 3 AR example
2014-04-30 3 AR example
2014-05-30 3 AR example
2014-06-30 3 AR example
2014-07-31 3 AR example
2014-08-29 3 AR example
2014-09-30 3 AR example
2014-10-31 3 AR example
2014-11-28 3 AR example
2014-12-31 3 AR example
I tried every sort_index option and, unexpectedly, this code worked:
print(ser_test.sort_index(level = ['Country', 'Date'], ascending = [True, True]))
Value
Date Group Country
2014-01-31 3 AE example
2014-02-28 3 AE example
2014-03-31 3 AE example
2014-04-30 3 AE example
2014-06-30 2 AE example
2014-07-31 2 AE example
2014-08-29 2 AE example
2014-09-30 2 AE example
2014-10-31 2 AE example
2014-11-28 2 AE example
2014-12-31 2 AE example
2014-01-31 3 AR example
2014-02-28 3 AR example
2014-03-31 3 AR example
2014-04-30 3 AR example
2014-05-30 3 AR example
2014-06-30 3 AR example
2014-07-31 3 AR example
2014-08-29 3 AR example
2014-09-30 3 AR example
2014-10-31 3 AR example
2014-11-28 3 AR example
2014-12-31 3 AR example
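This is strange, and I am not sure it is a general way to get a guaranteed result. Working with a MultiIndex is essential for me.
So, could you help me understand how sort_index orders the levels, and share a piece of code for this particular case?
You can try upgrading to the latest version of pandas. I tested this in pandas 0.25.0 and it works as expected: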
print(df.sort_index(level = ['Country', 'Date']))
Value
Date Group Country
2014-01-31 3 AE example
2014-02-28 3 AE example
2014-03-31 3 AE example
2014-04-30 3 AE example
2014-06-30 2 AE example
2014-07-31 2 AE example
2014-08-29 2 AE example
2014-09-30 2 AE example
2014-10-31 2 AE example
2014-11-28 2 AE example
2014-12-31 2 AE example
2014-01-31 3 AR example
2014-02-28 3 AR example
2014-03-31 3 AR example
2014-04-30 3 AR example
2014-05-30 3 AR example
2014-06-30 3 AR example
2014-07-31 3 AR example
2014-08-29 3 AR example
2014-09-30 3 AR example
2014-10-31 3 AR example
2014-11-28 3 AR example
2014-12-31 3 AR example
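If upgrading is not an option, or you would rather not depend on how a particular version of sort_index treats the remaining levels, one possible workaround is to sort an explicit copy of the index levels and reorder the Series by position. The sketch below is not from the original answer; sort_by_levels, ser_small and the five-row sample are hypothetical names and data used only for illustration.

```python
import pandas as pd


def sort_by_levels(ser, levels):
    """Reorder `ser` by the named index levels only (hypothetical helper).

    Levels that are not listed stay attached to their rows but are not
    used as sort keys.
    """
    # One column per index level, with a positional 0..n-1 row index.
    keys = ser.index.to_frame(index=False)
    # After sorting, the frame's index holds the original row positions.
    order = keys.sort_values(list(levels)).index.to_numpy()
    return ser.iloc[order]


# A small sample with the same shape as the question's data.
rows = [
    ("2014-04-30", 3, "AE"),
    ("2014-04-30", 3, "AR"),
    ("2014-05-30", 3, "AR"),
    ("2014-06-30", 2, "AE"),
    ("2014-06-30", 3, "AR"),
]
index = pd.MultiIndex.from_tuples(
    [(pd.Timestamp(d), g, c) for d, g, c in rows],
    names=["Date", "Group", "Country"],
)
ser_small = pd.Series(range(len(rows)), index=index, name="Value")

result = sort_by_levels(ser_small, ["Country", "Date"])
print(result)

# Sanity check: with Group dropped and the levels put in sort order,
# the index should be monotonically increasing.
check = result.index.droplevel("Group").reorder_levels(["Country", "Date"])
print(check.is_monotonic_increasing)
```

The same monotonicity check can be applied to the output of sort_index to confirm, on whatever pandas version you run, that the result really is ordered by Country and then Date.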
Thank you very much, @jezrael! I updated pandas from 0.24.2 to 0.25.1, and sort_index now works as I expected!