Python: How do I use pandas .agg() with dynamic column names and multiple functions?


python pandas


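I have been playing with pandas to get more familiar with it. I can't find a way to use pivot_table() or agg() with dynamic column names (taken from sticker_name) and different functions (list for book_sticker_id and '||'.join for sticker_value). I don't know how to replace X with [COL_NAME]_book_sticker_id and Y with [COL_NAME]_sticker_value for each group in the sticker_name column. Any help would be appreciated.
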
Input
| book_id  | book_sticker_id   | sticker_name  | sticker_value  |
| -------- | ----------------- | ------------- | -------------- |
| 1        | 1                 | label         | Value 1        |
| 1        | 2                 | label         | Value 2        |
| 1        | 3                 | label         | Value 3        |
| 1        | 4                 | label2        | Value 4        |
| 1        | 5                 | label2        | Value 5        |
| 1        | 6                 | label2        | Value 6        |
| 2        | 7                 | label         | Value 7        |
| 2        | 8                 | label         | Value 8        |
| 2        | 9                 | label         | Value 9        |
| 2        | 10                | label2        | Value 10       |
| 2        | 11                | label2        | Value 11       |
| 2        | 12                | label2        | Value 12       |
and so on...


My attempt
df = df.groupby('book_sticker_id').agg(X=('book_sticker_id', list), Y=('sticker_value', '||'.join))



Desired output
| book_id  | label_book_sticker_ids  | label_sticker_values         | label2_book_sticker_ids | label2_sticker_values           |
|----------|-------------------------|------------------------------|-------------------------|---------------------------------|
| 1        | [1,2,3]                 | 'Value 1||Value 2||Value 3'  | [4,5,6]                 | 'Value 4||Value 5||Value 6'     |
| 2        | [7,8,9]                 | 'Value 7||Value 8||Value 9'  | [10,11,12]              | 'Value 10||Value 11||Value 12'  |

and so on...

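Try:
out = (df.groupby(['book_id', 'sticker_name'])
         # one aggregation function per source column
         .agg({'book_sticker_id': list,
               'sticker_value': '||'.join})
         # move sticker_name from the row index into the columns
         .unstack()
         # order columns by sticker_name first, then by source column
         .sort_index(level=[1, 0], axis=1)
      )

# flatten each (column, sticker_name) pair into '<sticker_name>_<column>'
out.columns = [f'{y}_{x}' for x, y in out.columns]

Output: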
        label_book_sticker_id        label_sticker_value label2_book_sticker_id          label2_sticker_value
book_id                                                                                                      
1                   [1, 2, 3]  Value 1||Value 2||Value 3              [4, 5, 6]     Value 4||Value 5||Value 6
2                   [7, 8, 9]  Value 7||Value 8||Value 9           [10, 11, 12]  Value 10||Value 11||Value 12
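
For anyone who wants to run the answer end to end, here is a self-contained sketch. The DataFrame is rebuilt from the first twelve rows of the question's input table; the intermediate names `grouped` and `wide` are only there for illustration and are not part of the original answer.

```python
import pandas as pd

# Rebuild the sample input shown in the question (first 12 rows only).
df = pd.DataFrame({
    "book_id": [1] * 6 + [2] * 6,
    "book_sticker_id": list(range(1, 13)),
    "sticker_name": (["label"] * 3 + ["label2"] * 3) * 2,
    "sticker_value": [f"Value {i}" for i in range(1, 13)],
})

# Aggregate once per (book_id, sticker_name) pair, one function per column.
grouped = df.groupby(["book_id", "sticker_name"]).agg(
    {"book_sticker_id": list, "sticker_value": "||".join}
)

# Move sticker_name from the row index into the columns.
wide = grouped.unstack()

# Columns are now (source_column, sticker_name) tuples. Order them by
# sticker_name first, then flatten each tuple into "<sticker_name>_<column>".
wide = wide.sort_index(level=[1, 0], axis=1)
wide.columns = [f"{name}_{col}" for col, name in wide.columns]

print(wide)
```

Splitting the chain into steps makes it easier to inspect what `unstack()` does to the column index before the names are flattened.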


It will take me some time to understand what is going on, but it seems to work :) Thanks a lot! I made a mistake in my example: I want the values as a single string separated by "||". Is there an easy way to make the code do that? ['Value 1', 'Value 2', 'Value 3'] should become the string Value 1||Value 2||Value 3. I have updated the desired output to clarify…

@Konichawa Just use an agg dictionary, like you originally tried. See the updated answer.
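
As a possible alternative, the keyword style from the question's own attempt (X=..., Y=...) can also take names built at runtime, because keyword arguments can be unpacked from a dictionary. The sketch below is not the answer's method: it loops over the sticker_name groups, builds the keywords for each one, and concatenates the results. The names `spec`, `parts`, and `out` are illustrative, and the data is again the first twelve rows of the question's input.

```python
import pandas as pd

# Same sample input as in the question (first 12 rows only).
df = pd.DataFrame({
    "book_id": [1] * 6 + [2] * 6,
    "book_sticker_id": list(range(1, 13)),
    "sticker_name": (["label"] * 3 + ["label2"] * 3) * 2,
    "sticker_value": [f"Value {i}" for i in range(1, 13)],
})

parts = []
for name, group in df.groupby("sticker_name"):
    # Build the output column names at runtime; each value is a
    # (source_column, aggregation_function) pair for named aggregation.
    spec = {
        f"{name}_book_sticker_id": ("book_sticker_id", list),
        f"{name}_sticker_value": ("sticker_value", "||".join),
    }
    # ** unpacks the dictionary into keyword arguments for agg().
    parts.append(group.groupby("book_id").agg(**spec))

# Each part is indexed by book_id, so they line up side by side.
out = pd.concat(parts, axis=1)
print(out)
```

Note that `'||'.join` only works when every value in the column is already a string; a numeric column would need something like `.astype(str)` first.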