Python: aggregating values by category
I have a DataFrame like this:
SK_ID_CURR  CREDIT_ACTIVE  CREDIT_DAY_OVERDUE
436084      Sold           0
436084      Active         951
436084      Sold           0
436084      Active         0
436084      Bad debt       0
436084      Active         936
436084      Active         951
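I want to create a new column for each CREDIT_ACTIVE category, containing the sum of the corresponding CREDIT_DAY_OVERDUE values.
The result should look like this: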
SK_ID_CURR  CREDIT_ACTIVE_OD  CREDIT_BAD_DEBT_OD  CREDIT_ACTIVE_SOLD_OD
436084      2838              0                   0
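Group by SK_ID_CURR and CREDIT_ACTIVE, aggregate with a sum, and finally reshape the result so that each category becomes its own column.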
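A minimal sketch of the group, sum, and reshape step. The exact calls are an assumption: it uses groupby for the grouping and unstack for the reshaping, with fill_value=0 added so that missing combinations show 0 rather than NaN. The sample rows are copied from the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "SK_ID_CURR": [436084] * 7,
    "CREDIT_ACTIVE": ["Sold", "Active", "Sold", "Active",
                      "Bad debt", "Active", "Active"],
    "CREDIT_DAY_OVERDUE": [0, 951, 0, 0, 0, 936, 951],
})

# Sum the overdue days for every (ID, category) pair, then move the
# category level of the index into the columns.
# Assumption: groupby + unstack is the intended pair of calls here.
df = (
    df.groupby(["SK_ID_CURR", "CREDIT_ACTIVE"])["CREDIT_DAY_OVERDUE"]
      .sum()
      .unstack(fill_value=0)
)
print(df)
```

The result is indexed by SK_ID_CURR and has one column per category, which is the shape the renaming and reset_index steps expect.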
Alternatively, a pivot table with a sum aggregation gives the same result.
Then rename the columns:
df.columns = ['CREDIT_{}_OD'.format(x.upper()) for x in df.columns]
Finally, turn the index into a column:
df = df.reset_index()
print (df)
   SK_ID_CURR  CREDIT_ACTIVE_OD  CREDIT_BAD DEBT_OD  CREDIT_SOLD_OD
0      436084              2838                   0               0
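Note that the printed name CREDIT_BAD DEBT_OD still contains a space, while the question asks for CREDIT_BAD_DEBT_OD. A small sketch of one way to handle this, assuming it is acceptable to replace spaces with underscores when building the names. The frame called wide is a hypothetical stand-in for the reshaped table before renaming.

```python
import pandas as pd

# Hypothetical stand-in for the reshaped table before renaming:
# one row per SK_ID_CURR, one column per CREDIT_ACTIVE category.
wide = pd.DataFrame(
    {"Active": [2838], "Bad debt": [0], "Sold": [0]},
    index=pd.Index([436084], name="SK_ID_CURR"),
)

# Upper-case each category and replace spaces with underscores
# before wrapping it in the CREDIT_..._OD pattern.
wide.columns = [
    "CREDIT_{}_OD".format(name.upper().replace(" ", "_"))
    for name in wide.columns
]

# Move SK_ID_CURR from the index back into an ordinary column.
wide = wide.reset_index()
print(wide.columns.tolist())
# ['SK_ID_CURR', 'CREDIT_ACTIVE_OD', 'CREDIT_BAD_DEBT_OD', 'CREDIT_SOLD_OD']
```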
Another approach is pd.pivot_table:
res = pd.pivot_table(df, index='SK_ID_CURR', columns='CREDIT_ACTIVE',
                     values='CREDIT_DAY_OVERDUE', aggfunc='sum')
print(res)
CREDIT_ACTIVE  Active  Bad debt  Sold
SK_ID_CURR
436084           2838         0     0
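With more than one SK_ID_CURR, some IDs may have no rows at all for a given category, and the pivot then contains NaN in those cells. A hedged sketch of handling that with the fill_value parameter of pd.pivot_table. The rows for 436084 are a subset of the question's data, and the ID 100001 is made up for illustration.

```python
import pandas as pd

# Rows for 436084 are a subset of the question's data;
# ID 100001 is a made-up example with only one category.
df = pd.DataFrame({
    "SK_ID_CURR": [436084, 436084, 436084, 436084, 100001],
    "CREDIT_ACTIVE": ["Sold", "Active", "Bad debt", "Active", "Active"],
    "CREDIT_DAY_OVERDUE": [0, 951, 0, 936, 12],
})

# fill_value=0 replaces the NaN that would otherwise appear where an
# ID has no rows for a category.
res = pd.pivot_table(df, index="SK_ID_CURR", columns="CREDIT_ACTIVE",
                     values="CREDIT_DAY_OVERDUE", aggfunc="sum",
                     fill_value=0)
print(res)
```

Whether 0 is the right filler depends on the use case: it treats "no credit of this type" the same as "no overdue days", which may or may not be desired.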