Python: sorting values within groups (pandas)

Suppose I have this DataFrame:

df = pd.DataFrame({
    'price': [2, 13, 24, 15, 11, 44], 
    'category': ["shirts", "pants", "shirts", "tops", "hat", "tops"],
})

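I want to sort the values as follows:

1. Find the highest price in each category.
2. Sort the categories by that highest price (in this example, in descending order: tops, shirts, pants, hat).
3. Within each category, sort the rows by price, highest first.

The final DataFrame should look like this: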

    price   category
0      44       tops
1      15       tops
2      24     shirts
3       2     shirts
4      13      pants
5      11        hat

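You can combine join with a groupby aggregation and then call sort_values:

df.join(df.groupby("category").agg("max"), on="category", rsuffix="_r").sort_values(
    ["price_r", "price"], ascending=False
)

Output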

   price category  price_r
5     44     tops       44
3     15     tops       44
2     24   shirts       24
0      2   shirts       24
1     13    pants       13
4     11      hat       11
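
A variant of the same idea, offered only as a sketch and not as part of the answer above: groupby(...).transform("max") broadcasts each category's highest price back onto its rows, so no join or suffix is needed. The helper column name max_price is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'price': [2, 13, 24, 15, 11, 44],
    'category': ["shirts", "pants", "shirts", "tops", "hat", "tops"],
})

# Broadcast each category's highest price back onto its rows.
# 'max_price' is a hypothetical helper column name.
result = (
    df.assign(max_price=df.groupby('category')['price'].transform('max'))
      .sort_values(['max_price', 'price'], ascending=False)
      .drop(columns='max_price')  # remove the helper once the order is fixed
)
print(result)
```

Because the helper column is dropped at the end of the chain, the result keeps only the original price and category columns.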

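I'm not a big fan of one-liners, so here is my solution:

# Add a column holding the highest price of each category
df = df.merge(df.groupby('category')['price'].max().rename('max_cat_price'),
              left_on='category', right_index=True)
# Sort by the category's highest price, then by price within the category
df = df.sort_values(['max_cat_price', 'price'], ascending=False)
# Drop the helper column with each category's highest price
df.drop('max_cat_price', axis=1, inplace=True)
print(df)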
   price category
5     44     tops
3     15     tops
2     24   shirts
0      2   shirts
1     13    pants
4     11      hat

I used get_group inside apply to look up the highest price of each category:

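 import pandas as pd

 df = pd.DataFrame({
     'price': [2, 13, 24, 15, 11, 44],
     'category': ["shirts", "pants", "shirts", "tops", "hat", "tops"],
 })
 grouped = df.groupby('category')

 # For each row, fetch its category's group and take the highest price
 df['price_r'] = df['category'].apply(lambda cat: grouped.get_group(cat).price.max())
 # Sort by the category's highest price first, then by price within the category
 df = df.sort_values(['price_r', 'price'], ascending=False)
 print(df)

Output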

    price category  price_r
 5     44     tops       44
 3     15     tops       44
 2     24   shirts       24
 0      2   shirts       24
 1     13    pants       13
 4     11      hat       11
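
One case the sample data does not exercise is two categories sharing the same highest price. Sorting on the category maximum and the price alone can then interleave their rows. A possible remedy, sketched below with hypothetical data, is to add category as a middle sort key so that each group stays contiguous.

```python
import pandas as pd

# Hypothetical data: "shirts" and "tops" share the same highest price (24).
df = pd.DataFrame({
    'price': [2, 24, 24, 15],
    'category': ["shirts", "shirts", "tops", "tops"],
})

# Highest price per category, broadcast onto every row of that category.
df['price_r'] = df.groupby('category')['price'].transform('max')

# 'category' as the middle key keeps tied categories from interleaving.
result = df.sort_values(['price_r', 'category', 'price'], ascending=False)
print(result)
```

With ascending=False applied to every key, tied categories come out in reverse alphabetical order; pass a list to ascending if a different tie order is wanted.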

A good pandas MRE is a rare sight these days. +1 for the simple, minimal DataFrame code.