Python: pandas groupby that preserves tuples
I have a DataFrame that looks like this (it actually has 35 columns and many more tuples, but these are the relevant columns):
leg_side leg_quantity expiration product change_type
0 None None None ZQ inserted
1 None None None HG inserted
2 None None None PL inserted
3 None None None SI inserted
4 None None None ZQ inserted
5 None None None PL inserted
6 None None None ZW inserted
7 None None None SI inserted
8 None None None ZQ updated
9 None None None SI inserted
10 None None None ZC updated
.. ... ... ... ... ...
970 None None None OZ inserted
971 None None None OZ deleted
972 None None None OZ updated
973 None None None ZC inserted
974 None None None OZ inserted
975 None None None ZC inserted
976 None None None OZ inserted
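What I want to do now is group by product, but not necessarily in the SQL sense. I want to bring all tuples with the same product together, and then sub-group them by change_type, to get a df like this: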
leg_side leg_quantity expiration product change_type
0 None None None ZQ inserted
4 None None None ZQ inserted
8 None None None ZQ updated
1 None None None HG inserted
2 None None None PL inserted
5 None None None PL inserted
3 None None None SI inserted
7 None None None SI inserted
9 None None None SI inserted
6 None None None ZW inserted
...
973 None None None ZC inserted
975 None None None ZC inserted
10 None None None ZC updated
970 None None None OZ inserted
974 None None None OZ inserted
976 None None None OZ inserted
972 None None None OZ updated
971 None None None OZ deleted
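The DataFrame above is organized so that all tuples with the same product name sit together, and within each of those groups all tuples with the same change_type are grouped together (preferably in the order inserted, updated, deleted). No tuples should be eliminated; I only want grouping in the sense of ordering. How can I do this?

You can use a Categorical with a custom order, then sort with sort_values: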
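# Make change_type an ordered categorical so it sorts in the custom order.
df['change_type'] = (df['change_type'].astype('category')
                     .cat
                     .set_categories(["inserted", "updated", "deleted"], ordered=True))
# Sort each product group by change_type, then discard the group index.
df = (df.groupby('product')
        .apply(lambda x: x.sort_values('change_type'))
        .reset_index(drop=True))
print(df)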
leg_side leg_quantity expiration product change_type
0 None None None HG inserted
1 None None None OZ inserted
2 None None None OZ inserted
3 None None None OZ inserted
4 None None None OZ updated
5 None None None OZ deleted
6 None None None PL inserted
7 None None None PL inserted
8 None None None SI inserted
9 None None None SI inserted
10 None None None SI inserted
11 None None None ZC inserted
12 None None None ZC inserted
13 None None None ZC updated
14 None None None ZQ inserted
15 None None None ZQ inserted
16 None None None ZQ updated
17 None None None ZW inserted
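Since the goal is ordering rather than aggregation, the same idea can probably be expressed without groupby at all. The following is a minimal sketch, not the answer's original code: the sample data is made up, and the names `change_order`, `by_name`, `by_first_seen` and the helper column `_product_order` are assumptions introduced only for illustration. The first variant sorts products alphabetically, as in the output above; the second keeps products in order of first appearance and preserves the original index, which is closer to the layout asked for in the question.

```python
import pandas as pd

# Hypothetical sample with only the two relevant columns.
df = pd.DataFrame({
    "product":     ["ZQ", "HG", "OZ", "ZQ", "OZ", "OZ", "ZQ"],
    "change_type": ["updated", "inserted", "deleted", "inserted",
                    "updated", "inserted", "inserted"],
})

# Ordered categorical: sorting follows the category order, not the alphabet.
change_order = ["inserted", "updated", "deleted"]
df["change_type"] = pd.Categorical(df["change_type"],
                                   categories=change_order, ordered=True)

# Variant 1: products alphabetical, change types in the custom order.
by_name = df.sort_values(["product", "change_type"])

# Variant 2: products in order of first appearance; the original index is kept.
first_seen = pd.Categorical(df["product"],
                            categories=df["product"].unique(), ordered=True)
by_first_seen = (df.assign(_product_order=first_seen)  # temporary helper column
                   .sort_values(["_product_order", "change_type"])
                   .drop(columns="_product_order"))

print(by_name)
print(by_first_seen)
```

No rows are dropped in either variant. Appending `.reset_index(drop=True)` renumbers the result as in the answer's output, while leaving it off keeps the original row labels.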