Python: Trying to find the 5 largest values per month using groupby


I am trying to show the top three values of nc_type for each month. I tried using nlargest, but that does not work per date.

Raw data:

      area  nc_type    occurred_date
0  Filling        x  12/23/2015 0:00
1  Filling        f  12/22/2015 0:00
2  Filling        s   9/11/2015 0:00
3  Filling        f   2/17/2016 0:00
4  Filling        s    5/3/2016 0:00
5  Filling        g   8/29/2016 0:00
6  Filling        f    9/9/2016 0:00
7  Filling        a    6/1/2016 0:00

Transformed with:

df.groupby([df.occurred_date.dt.month, "nc_type"])["rand"].count()

Transformed data:

occurred_date  nc_type                                   
1              x                            3
               y                            4
               z                           13
               w                           24
               f                           34
                                           ..
12             d                           18
               g                           10
               w                           44
               a                           27
               g                           42
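
For anyone trying to reproduce the transformation, here is a minimal sketch built from the eight sample rows in the question. The rand column used in the call above is not shown in the question, so this sketch counts rows with size() instead; that substitution and the explicit date format are assumptions.

```python
import pandas as pd

# Sample rows copied from the question's raw data.
raw = pd.DataFrame({
    "area": ["Filling"] * 8,
    "nc_type": ["x", "f", "s", "f", "s", "g", "f", "a"],
    "occurred_date": [
        "12/23/2015 0:00", "12/22/2015 0:00", "9/11/2015 0:00", "2/17/2016 0:00",
        "5/3/2016 0:00", "8/29/2016 0:00", "9/9/2016 0:00", "6/1/2016 0:00",
    ],
})

# The .dt accessor needs a datetime column, so parse the strings first.
raw["occurred_date"] = pd.to_datetime(raw["occurred_date"], format="%m/%d/%Y %H:%M")

# Count rows per (month, nc_type); size() does not need an extra column to count.
counts = raw.groupby([raw["occurred_date"].dt.month, "nc_type"]).size()
print(counts)
```

Note that dt.month alone pools the same month from different years (here September 2015 and September 2016 end up in the same group). If that is not wanted, grouping by year and month together would be one option.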

Scenario 1
MultiIndex Series

Call sort_values + groupby + head:

df.sort_values(ascending=False).groupby(level=0).head(2)

occurred_date  nc_type
12.0           w          44
               g          42
1.0            f          34
               w          24
Name: test, dtype: int64

Change head(2) to head(5) as per your situation.
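
A self-contained sketch of the sort_values + groupby + head approach on a MultiIndex Series. The sample values are hypothetical, chosen to mirror the shape of the transformed data; the Series name test is taken from the output shown above.

```python
import pandas as pd

# Hypothetical counts indexed by (month, nc_type).
index = pd.MultiIndex.from_tuples(
    [(1, "x"), (1, "z"), (1, "w"), (1, "f"),
     (12, "d"), (12, "w"), (12, "a"), (12, "g")],
    names=["occurred_date", "nc_type"],
)
s = pd.Series([3, 13, 24, 34, 18, 44, 27, 42], index=index, name="test")

# Sort every value descending, then keep the first 2 rows seen for each month.
top2 = s.sort_values(ascending=False).groupby(level=0).head(2)

# head() keeps the sorted row order, so the result is ordered by value,
# not by month.
print(top2)
```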

Alternatively, expanding on my comment with nlargest, you could do:

df.groupby(level=0).nlargest(2).reset_index(level=0, drop=True)

occurred_date  nc_type
1.0            f          34
               w          24
12.0           w          44
               g          42
Name: test, dtype: int64
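
To see why the reset_index call is there, the following sketch (again with hypothetical sample values) shows that nlargest on a grouped Series prepends the group key as an extra index level, which is then dropped.

```python
import pandas as pd

# Hypothetical counts indexed by (month, nc_type).
index = pd.MultiIndex.from_tuples(
    [(1, "x"), (1, "z"), (1, "w"), (1, "f"),
     (12, "d"), (12, "w"), (12, "a"), (12, "g")],
    names=["occurred_date", "nc_type"],
)
s = pd.Series([3, 13, 24, 34, 18, 44, 27, 42], index=index, name="test")

# nlargest runs within each month; the group key is added as a new outer level,
# so the index becomes (occurred_date, occurred_date, nc_type).
per_month = s.groupby(level=0).nlargest(2)
print(per_month.index.nlevels)

# Drop the duplicated outer level to get back to (occurred_date, nc_type).
top2 = per_month.reset_index(level=0, drop=True)
print(top2)
```

Unlike the sort_values + head variant, the result here comes back grouped by month.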

Scenario 2
3-column DataFrame

You can use sort_values + groupby + head:

df.sort_values(['occurred_date', 'value'], 
        ascending=[True, False]).groupby('occurred_date').head(2)

   occurred_date nc_type  value
4            1.0       f     34
3            1.0       w     24
7           12.0       w     44
9           12.0       g     42

Change head(2) to head(5) as per your scenario.
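
A runnable sketch of the flat, 3-column case. The frame below is hypothetical; the column names occurred_date, nc_type and value follow the output shown above.

```python
import pandas as pd

# Hypothetical flat frame: one row per (month, nc_type) with its count.
df = pd.DataFrame({
    "occurred_date": [1, 1, 1, 1, 12, 12, 12, 12],
    "nc_type": ["x", "z", "w", "f", "d", "w", "a", "g"],
    "value": [3, 13, 24, 34, 18, 44, 27, 42],
})

# Sort months ascending and values descending, then take the first 2 rows
# of each month.
top2 = (
    df.sort_values(["occurred_date", "value"], ascending=[True, False])
      .groupby("occurred_date")
      .head(2)
)
print(top2)
```

If you start from the MultiIndex Series instead, s.reset_index(name="value") should give a frame of this shape.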

Scenario 3
MultiIndex DataFrame

Or, using nlargest:

df.groupby(level=0).test.nlargest(2)\
              .reset_index(level=0, drop=True)

occurred_date  nc_type
1.0            f          34
               w          24
12.0           w          44
               g          42
Name: test, dtype: int64
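
A sketch of the MultiIndex DataFrame case with hypothetical values. It selects the column with ["test"] rather than attribute access, which is only a style choice; the point is that nlargest is called on a single grouped column, not on the whole grouped frame.

```python
import pandas as pd

# Hypothetical frame indexed by (month, nc_type) with one value column.
index = pd.MultiIndex.from_tuples(
    [(1, "x"), (1, "z"), (1, "w"), (1, "f"),
     (12, "d"), (12, "w"), (12, "a"), (12, "g")],
    names=["occurred_date", "nc_type"],
)
df = pd.DataFrame({"test": [3, 13, 24, 34, 18, 44, 27, 42]}, index=index)

# Pick the column first so nlargest runs on a Series within each month,
# then drop the group-key level that nlargest prepends.
top2 = (
    df.groupby(level=0)["test"]
      .nlargest(2)
      .reset_index(level=0, drop=True)
)
print(top2)
```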

I would include group_keys=False:

df.groupby('occurred_date', group_keys=False).nlargest(3)

occurred_date  nc_type
1.0            f          34
               w          24
               z          13
12.0           w          44
               g          42
               a          27
Name: value, dtype: int64
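
As a rough comparison, the sketch below (hypothetical values, Series named value as in the output above) suggests that group_keys=False gives the same rows as dropping the prepended level afterwards. It refers to the grouping level with level="occurred_date", which is an assumption about how the index is named.

```python
import pandas as pd

# Hypothetical counts indexed by (month, nc_type).
index = pd.MultiIndex.from_tuples(
    [(1, "x"), (1, "z"), (1, "w"), (1, "f"),
     (12, "d"), (12, "w"), (12, "a"), (12, "g")],
    names=["occurred_date", "nc_type"],
)
s = pd.Series([3, 13, 24, 34, 18, 44, 27, 42], index=index, name="value")

# group_keys=False stops the group key from being prepended to the index.
a = s.groupby(level="occurred_date", group_keys=False).nlargest(3)

# Default behaviour, followed by dropping the prepended level by hand.
b = s.groupby(level="occurred_date").nlargest(3).reset_index(level=0, drop=True)

print(a)
print(b)
```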

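df.groupby('occurred_date').nlargest(5)? Also, what is the name of the last column? Or are the first two columns a MultiIndex?

The original dataset has many records with occurred_date and nc_type associated; I used groupby to get to my current DataFrame.

We were forced to make assumptions about the structure of your data. It would be great if you could confirm against one of our answers. Thanks.

Added information about my original data.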
