Python: Get the top 3 rows by column value using groupby in Pandas

python pandas

I have this DataFrame:

    person_code  type   growth   size  ...
0 .         231    32     0.54     32
1 .         233    43     0.12    333
2 .         432    32     0.44     21
3 .         431    56     0.32     23
4 .         654    89     0.12     89
5 .         764    32     0.20    211
6 .         434    32     0.82     90
...

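(The DataFrame is quite large; I have simplified it here.)

For each type, I want to create a DataFrame containing the 3 people with the highest growth, sorted by growth. I would like to be able to call it by type. In this example, let's use type 32, so the output df should look like this: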

    person_code  type   growth   size  ...
6 .         434    32     0.82     90
0 .         231    32     0.54     32
2 .         432    32     0.44     21
...

I know this would be done with groupby:

groups=dataframe.groupby('type')

But how can I call the groupby object for the rows with type 32?

What is the best way to get the top 3 by growth?

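IIUC, you don't need groupby: filter the DataFrame with query, then call nlargest(3, 'growth') on the result.

To parameterize the type input, you can use the following syntax: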

in_type = 32

df.query('type == @in_type').nlargest(3, 'growth')

Output:

     person_code  type  growth  size
6 .          434    32    0.82    90
0 .          231    32    0.54    32
2 .          432    32    0.44    21
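
To make the query-then-nlargest approach easy to try, here is a minimal sketch that rebuilds the seven sample rows from the question and wraps the call in a small function. The helper name `top_growth` and its parameters are assumptions made for illustration; they are not part of the original answer.

```python
import pandas as pd

# Sample rows copied from the question (the real DataFrame is larger).
df = pd.DataFrame(
    {
        "person_code": [231, 233, 432, 431, 654, 764, 434],
        "type": [32, 43, 32, 56, 89, 32, 32],
        "growth": [0.54, 0.12, 0.44, 0.32, 0.12, 0.20, 0.82],
        "size": [32, 333, 21, 23, 89, 211, 90],
    }
)


def top_growth(frame, in_type, n=3):
    """Hypothetical helper: the n rows of one type with the largest growth."""
    # @in_type refers to the local Python variable from inside the query string.
    return frame.query("type == @in_type").nlargest(n, "growth")


top_32 = top_growth(df, 32)
print(top_32)
```

nlargest already returns the rows ordered from highest to lowest growth, so no separate sort is needed.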

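Alternatively, if you want to use groupby, you can compute the top 3 for every type and then use query to get only the type you need: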

type_group_df = df.groupby('type', group_keys=False)\
                  .apply(pd.DataFrame.nlargest,n=3,columns='growth')

To call it, you can use:

type_group_df.query('type == 32')
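
As a variation on the groupby route, the sketch below swaps in a different technique: sort once by growth, then keep the first 3 rows of each type with groupby(...).head(3). It is not the method from the answer above, but it avoids groupby.apply, which some newer pandas releases warn about when the function also operates on the grouping column. The sample data is copied from the question.

```python
import pandas as pd

# Sample rows copied from the question (the real DataFrame is larger).
df = pd.DataFrame(
    {
        "person_code": [231, 233, 432, 431, 654, 764, 434],
        "type": [32, 43, 32, 56, 89, 32, 32],
        "growth": [0.54, 0.12, 0.44, 0.32, 0.12, 0.20, 0.82],
        "size": [32, 333, 21, 23, 89, 211, 90],
    }
)

# Sort by growth (highest first), then keep at most 3 rows per type.
type_group_df = (
    df.sort_values("growth", ascending=False)
    .groupby("type")
    .head(3)
)

# The grouping column is still an ordinary column, so query works as before.
top_32 = type_group_df.query("type == 32")
print(top_32)
```

Because head keeps the order of the sorted frame, each type's rows come out already ranked by growth.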

If your types are strings, it would look like this:

type_group_df.query('type == "brazilian"')

However, if the column name starts with a special character, such as "#", you should use boolean indexing instead:

type_group_df[type_group_df['#type'] == 32]

Output:

     person_code  type  growth  size
6 .          434    32    0.82    90
0 .          231    32    0.54    32
2 .          432    32    0.44    21
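
The boolean-indexing form can be sketched end to end as follows. The frame `df_hash` and its "#type" column are hypothetical, built from the question's sample rows only to illustrate a column name that begins with "#". A likely reason query is awkward here is that it parses its argument as an expression, where "#" would start a comment.

```python
import pandas as pd

# Hypothetical variant of the sample data whose type column is named "#type".
df_hash = pd.DataFrame(
    {
        "person_code": [231, 233, 432, 431, 654, 764, 434],
        "#type": [32, 43, 32, 56, 89, 32, 32],
        "growth": [0.54, 0.12, 0.44, 0.32, 0.12, 0.20, 0.82],
    }
)

# Build a boolean mask with ordinary column access instead of a query string.
mask = df_hash["#type"] == 32

# Filter with the mask, then take the 3 rows with the largest growth.
top_32 = df_hash[mask].nlargest(3, "growth")
print(top_32)
```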

Querying another type (43) works the same way, with type == 43 in the query.

Output:

     person_code  type  growth  size
1 .          233    43    0.12   333
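
Since the question asks for one DataFrame per type that can be called by type, another option is to store the per-type results in a dictionary. This is only a sketch under the same sample data; the name `top3_by_type` is an assumption for illustration and does not come from the answers above.

```python
import pandas as pd

# Sample rows copied from the question (the real DataFrame is larger).
df = pd.DataFrame(
    {
        "person_code": [231, 233, 432, 431, 654, 764, 434],
        "type": [32, 43, 32, 56, 89, 32, 32],
        "growth": [0.54, 0.12, 0.44, 0.32, 0.12, 0.20, 0.82],
        "size": [32, 333, 21, 23, 89, 211, 90],
    }
)

# Iterating over a groupby yields (key, sub-DataFrame) pairs, one per type.
top3_by_type = {
    type_value: group.nlargest(3, "growth")
    for type_value, group in df.groupby("type")
}

# Look up any type directly by its value.
print(top3_by_type[32])
print(top3_by_type[43])
```

A type with fewer than 3 rows, such as 43 here, simply returns the rows it has.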
