Python: pandas groupby that preserves the rows (tuples)

I have a DataFrame that looks like this (it actually has 35 columns and many more rows, but these are the relevant columns):

     leg_side  leg_quantity expiration product  change_type  
0        None          None       None      ZQ     inserted  
1        None          None       None      HG     inserted  
2        None          None       None      PL     inserted  
3        None          None       None      SI     inserted  
4        None          None       None      ZQ     inserted  
5        None          None       None      PL     inserted  
6        None          None       None      ZW     inserted  
7        None          None       None      SI     inserted  
8        None          None       None      ZQ     updated  
9        None          None       None      SI     inserted  
10       None          None       None      ZC     updated
..        ...           ...        ...     ...          ...  
970      None          None       None      OZ     inserted  
971      None          None       None      OZ     deleted  
972      None          None       None      OZ     updated  
973      None          None       None      ZC     inserted  
974      None          None       None      OZ     inserted  
975      None          None       None      ZC     inserted  
976      None          None       None      OZ     inserted
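What I want to do now is group by product, though not necessarily in the SQL sense. I want to bring all rows with the same product together, and then sub-group them by change_type, to get a df like this: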

     leg_side  leg_quantity expiration product  change_type  
0        None          None       None      ZQ     inserted
4        None          None       None      ZQ     inserted
8        None          None       None      ZQ     updated 
1        None          None       None      HG     inserted
2        None          None       None      PL     inserted
5        None          None       None      PL     inserted
3        None          None       None      SI     inserted
7        None          None       None      SI     inserted
9        None          None       None      SI     inserted
6        None          None       None      ZW     inserted
...
973      None          None       None      ZC     inserted
975      None          None       None      ZC     inserted
10       None          None       None      ZC     updated
970      None          None       None      OZ     inserted
974      None          None       None      OZ     inserted
976      None          None       None      OZ     inserted
972      None          None       None      OZ     updated
971      None          None       None      OZ     deleted

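The DataFrame above is organized so that all rows with the same product name sit together, and within each of those groups all rows with the same change_type are grouped together (preferably in the order inserted, updated, deleted). The rows will be eliminated afterwards; for now I only want to understand the grouping and sorting step. How can I do this?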

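You can convert change_type to a categorical with a custom order and then sort within each product group. Note that groupby sorts the group keys by default, so the products come out in alphabetical order: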

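# Make change_type an ordered categorical: inserted < updated < deleted
df['change_type'] = (df['change_type'].astype('category')
                                      .cat
                                      .set_categories(["inserted", "updated", "deleted"], ordered=True))

# Sort each product group by change_type, then flatten the index.
# Note: recent pandas releases deprecate apply() operating on the grouping column.
df = (df.groupby('product')
        .apply(lambda x: x.sort_values('change_type'))
        .reset_index(drop=True))
print(df)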

   leg_side leg_quantity expiration product change_type
0      None         None       None      HG    inserted
1      None         None       None      OZ    inserted
2      None         None       None      OZ    inserted
3      None         None       None      OZ    inserted
4      None         None       None      OZ     updated
5      None         None       None      OZ     deleted
6      None         None       None      PL    inserted
7      None         None       None      PL    inserted
8      None         None       None      SI    inserted
9      None         None       None      SI    inserted
10     None         None       None      SI    inserted
11     None         None       None      ZC    inserted
12     None         None       None      ZC    inserted
13     None         None       None      ZC     updated
14     None         None       None      ZQ    inserted
15     None         None       None      ZQ    inserted
16     None         None       None      ZQ     updated
17     None         None       None      ZW    inserted
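
The output above lists products alphabetically, while the layout asked for in the question keeps products in the order they first appear (ZQ, HG, PL, ...). A minimal sketch of one way to get that ordering without groupby is shown below. It is an alternative to the answer's method, not part of it: the sample data, the helper column `_first_seen`, and the function name `sort_by_product_then_change` are all made up for illustration, and only the two relevant columns are included.

```python
import pandas as pd

# Hypothetical sample mirroring the question's product / change_type columns.
df = pd.DataFrame({
    "product": ["ZQ", "HG", "PL", "SI", "ZQ", "PL", "ZW",
                "SI", "ZQ", "SI", "ZC", "OZ", "OZ", "OZ"],
    "change_type": ["inserted", "inserted", "inserted", "inserted",
                    "inserted", "inserted", "inserted", "inserted",
                    "updated", "inserted", "updated", "inserted",
                    "deleted", "updated"],
})

CHANGE_ORDER = ["inserted", "updated", "deleted"]


def sort_by_product_then_change(frame):
    """Keep products in first-appearance order; order change_type within each."""
    out = frame.copy()
    # Ordered categorical so sorting follows CHANGE_ORDER, not the alphabet.
    out["change_type"] = pd.Categorical(
        out["change_type"], categories=CHANGE_ORDER, ordered=True
    )
    # Rank each product by the position of its first occurrence.
    first_seen = {p: i for i, p in enumerate(pd.unique(out["product"]))}
    out["_first_seen"] = out["product"].map(first_seen)  # temporary helper column
    out = out.sort_values(["_first_seen", "change_type"])
    return out.drop(columns="_first_seen")


result = sort_by_product_then_change(df)
print(result)
```

The original row labels are kept here, as in the question's desired output; append `.reset_index(drop=True)` if a fresh 0..n-1 index is preferred.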