Python data manipulation: adding an extra index column and sorting the index alphabetically

python pandas sorting dataframe indexing

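After successfully implementing the index manipulation from my previous question (see the link below), I wanted the columns sorted alphanumerically.

I want to arrange the DataFrame with an additional/secondary index, the customer category, and sort the customer names alphabetically within each category.

I was thinking of creating a dictionary that maps each customer name to a specific category and then sorting alphabetically. I am not sure whether this would work or how to implement it.

I want to sort alphabetically by two_idx first and then by name. This is the current code: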

df = df.pivot_table(index=['name'], columns=['Duration'],
                                        aggfunc={'sum': np.sum}, fill_value=0)

# Sort Index Values - Duration
c = df_with_col_arg.columns.levels[1]
c = sorted(ns.natsorted(c), key=lambda x: not x.isdigit())

# Reindex Maturity values after Sorting
df_ = df.reindex_axis(pd.MultiIndex.from_product([df.columns.levels[0], c]), axis=1)

map_dict = {
            'Invoice A': 'A1. Retail',
            'Invoice B': 'A1. Retail',
            'Invoice Z': 'A1. Retail',
            'Invoice C': 'C1. Plastics',
            'Invoice F': 'C1. Plastics',
            'Invoice D': 'C2. Electronics',
            'Invoice J': 'C2. Electronics'
            }

# New Column - later to be converted to a secondary index
df['two_idx'] = df.index.to_series().map(map_dict)
df = df.sort_values(['two_idx'], ascending=[False]).set_index(['two_idx', 'name'])

Output of df.columns:

The output I want is:

Duration                            2          2Y         3         3Y   
two_idx           name                                                                     
A1. Retail        Invoice A      25.50        0.00      0.00       20.00   
A1. Retail        Invoice B      50.00        25.00     -10.50     0.00
C1. Plastics      Invoice C      125.00       0.00      11.20      0.50
C2. Electronics   Invoice D       0.00        15.00      0.00       80.10

I believe you need:

import natsort as ns
import pandas as pd

# pass values= and a single aggfunc so the columns are not a MultiIndex
df = df.pivot_table(index='name', 
                    columns='Duration',
                    values='sum',
                    aggfunc='sum', 
                    fill_value=0)

#https://stackoverflow.com/a/47240142/2901002
c = sorted(ns.natsorted(df.columns), key=lambda x: not x.isdigit())
df = df.reindex(c, axis=1)

map_dict = {
            'Invoice A': 'A1. Retail',
            'Invoice B': 'A1. Retail',
            'Invoice Z': 'A1. Retail',
            'Invoice C': 'C1. Plastics',
            'Invoice F': 'C1. Plastics',
            'Invoice D': 'C2. Electronics',
            'Invoice J': 'C2. Electronics'
            }

#create new level of MultiIndex and assign back
df.index = pd.MultiIndex.from_arrays([df.rename(map_dict).index, 
                                      df.index], names=['name','one'])

#sorting index
df = df.sort_index()
print (df)
                               2     3    2Y    3Y
name            one                               
A1. Retail      Invoice A   25.5   0.0   0.0  20.0
                Invoice B   50.0 -10.5  25.0   0.0
C1. Plastics    Invoice C  125.0  11.2   0.0   0.5
C2. Electronics Invoice D    0.0   0.0  15.0  80.1
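
The steps in the answer above can be tried end to end with the minimal sketch below. It rests on several assumptions: the sample rows in `raw` are made up to resemble the question's table, the index levels are named `two_idx` and `name` as the question asked (the answer used `name` and `one`), and the helper `duration_key` is a hypothetical stand-in for `natsort` so that the sketch only needs pandas and the standard library.

```python
import re

import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
raw = pd.DataFrame({
    "name": ["Invoice D", "Invoice A", "Invoice C",
             "Invoice B", "Invoice A", "Invoice B"],
    "Duration": ["3Y", "2", "3", "2Y", "3Y", "2"],
    "sum": [80.1, 25.5, 11.2, 25.0, 20.0, 50.0],
})

map_dict = {
    "Invoice A": "A1. Retail",
    "Invoice B": "A1. Retail",
    "Invoice C": "C1. Plastics",
    "Invoice D": "C2. Electronics",
}

# A single value column and a single aggfunc give flat columns.
wide = raw.pivot_table(index="name", columns="Duration", values="sum",
                       aggfunc="sum", fill_value=0)


def duration_key(label):
    """Hypothetical sort key: pure-digit labels first, then by leading number."""
    match = re.match(r"\d+", label)
    number = int(match.group()) if match else 0
    return (not label.isdigit(), number, label)


# Reorder the Duration columns with the custom key.
wide = wide.reindex(sorted(wide.columns, key=duration_key), axis=1)

# Category becomes the outer index level, customer name the inner level.
categories = wide.index.map(map_dict)
wide.index = pd.MultiIndex.from_arrays([categories, wide.index],
                                       names=["two_idx", "name"])

# sort_index orders by category first, then by name within each category.
wide = wide.sort_index()
print(wide)
```

With this sample data the columns come out in the order 2, 3, 2Y, 3Y, and the rows are grouped by category with the names sorted inside each group. Any name missing from `map_dict` would be mapped to NaN in the `two_idx` level, so it is worth checking the dictionary covers every customer before sorting.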