Python: creating new columns from aggregates of other columns in pandas


python pandas


I have the following DataFrame:

   col1    col2  col3
0   tom     2    cash
1   tom     3    gas
2   tom     5    online
3   jerry   1    online
4   jerry   4    online
5   jerry   5    gas
6   scooby  8    cash
7   scooby  6    dogfood
8   scooby  1    cheese

It can be created easily with:

import pandas as pd

data = {'col1': ['tom', 'tom', 'tom', 'jerry', 'jerry', 'jerry', 'scooby', 'scooby', 'scooby'],
        'col2': [2, 3, 5, 1, 4, 5, 8, 6, 1],
        'col3': ['cash', 'gas', 'online', 'online', 'online', 'gas', 'cash', 'dogfood', 'cheese']}

# Keep a reference to the frame so the answers below can use `df`
df = pd.DataFrame(data)

How can I group the data by col1 and then, as additional columns, get a specific aggregate for selected values of col3?

For example, say I want to group by col1 and get, for each person in col1, the sum of col2 for the gas, cash and online rows, like this:

col1    gas_sum    cash_sum    online_sum
tom        3          2             5
jerry      5          0             5
scooby     0          8             0

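I'm fairly new to pandas, and the only approach I can think of is a for loop over all the data, because in the examples I have seen, groupby is used to give the sum/mean of a whole column such as col2.

Thanks for your help.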

IIUC, we can combine isin, groupby and unstack:

df1 = df.loc[df["col3"].isin(["gas", "online", "cash"])].groupby(["col1", "col3"])[
    "col2"
].sum().unstack().fillna(0)

df1.columns = df1.columns.map(lambda x : x + '_sum')

df1.columns.name = ''

print(df1)

        cash_sum  gas_sum  online_sum
col1                                 
jerry        0.0      5.0         5.0
scooby       8.0      0.0         0.0
tom          2.0      3.0         5.0
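The values above come out as floats because unstack first produces NaN for the missing combinations and fillna then replaces them. As a minimal self-contained sketch of the same isin / groupby / unstack chain, the version below passes fill_value=0 to unstack instead, which should keep the integer dtype. The names wanted, filtered, sums and wide are only illustrative and are not part of the original answer.

```python
import pandas as pd

data = {
    "col1": ["tom", "tom", "tom", "jerry", "jerry", "jerry", "scooby", "scooby", "scooby"],
    "col2": [2, 3, 5, 1, 4, 5, 8, 6, 1],
    "col3": ["cash", "gas", "online", "online", "online", "gas", "cash", "dogfood", "cheese"],
}
df = pd.DataFrame(data)

# Categories of col3 we want to turn into columns (illustrative name)
wanted = ["gas", "cash", "online"]

# Step 1: keep only the rows whose col3 is one of the wanted categories
filtered = df[df["col3"].isin(wanted)]

# Step 2: sum col2 for every (person, category) pair
sums = filtered.groupby(["col1", "col3"])["col2"].sum()

# Step 3: move the category level into columns; absent pairs become 0
wide = sums.unstack(fill_value=0).add_suffix("_sum")
wide.columns.name = None  # drop the leftover "col3" label on the column axis

print(wide)
```

Note that a person who has no rows at all in the wanted categories would be dropped by the filter in step 1; in this data every person has at least one such row.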

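Another approach uses pivot_table. We also use reindex to keep only the values you're interested in, and add_suffix to change the column names: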

# Values to sum
values = ['cash', 'gas', 'online']

df_out = (df.pivot_table(index='col1', columns='col3',
                         values='col2', aggfunc='sum',
                         fill_value=0)
 .reindex(columns=values, fill_value=0)
 .add_suffix('_sum'))

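[out]

col3    cash_sum  gas_sum  online_sum
col1                                 
jerry          0        5           5
scooby         8        0           0
tom            2        3           5

Amazing, thank you, sir! Sadly, I don't have enough reputation to upvote your answer.

@Joram No problem. You can also filter and use the crosstab or pivot functions, which are easier to read in my opinion. Good luck :)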
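The crosstab route mentioned in the comments is not spelled out in the thread, so here is a rough sketch of how it might look, assuming the same sample data. It filters first and then calls pd.crosstab with values and aggfunc; the variable names wanted, subset and table are hypothetical.

```python
import pandas as pd

data = {
    "col1": ["tom", "tom", "tom", "jerry", "jerry", "jerry", "scooby", "scooby", "scooby"],
    "col2": [2, 3, 5, 1, 4, 5, 8, 6, 1],
    "col3": ["cash", "gas", "online", "online", "online", "gas", "cash", "dogfood", "cheese"],
}
df = pd.DataFrame(data)

# Categories to aggregate, in the column order we want (illustrative name)
wanted = ["gas", "cash", "online"]

# Filter first so unwanted categories never reach the cross-tabulation
subset = df[df["col3"].isin(wanted)]

table = (
    pd.crosstab(
        index=subset["col1"],
        columns=subset["col3"],
        values=subset["col2"],
        aggfunc="sum",
    )
    .reindex(columns=wanted)  # fix the column order; absent categories become NaN
    .fillna(0)                # person/category pairs with no rows count as 0
    .astype(int)              # NaN forced floats, so cast back to integers
    .add_suffix("_sum")
)

print(table)
```

Compared with the pivot_table version, the main difference is that crosstab takes the Series directly rather than column names; which one reads better is largely a matter of taste.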