Sum values with pandas groupby and rename the old column?
As the code below shows, I want to group the data by account_id, sum system_value and rename the result to total_value, while keeping the data for each date.
import pandas as pd

s = [
    {'account_id': '1166470734', 'entity': 'entity1', 'system_value': 10.2, 'date': '2010-01-02', 'sale': 'sale1'},
    {'account_id': '1166470734', 'entity': 'entity1', 'system_value': 2.2, 'date': '2010-01-03', 'sale': 'sale1'},
    {'account_id': '123232323', 'entity': 'entity2', 'system_value': 4.2, 'date': '2010-01-03', 'sale': 'sale2'},
    {'account_id': '123232323', 'entity': 'entity2', 'system_value': 5.2, 'date': '2010-01-04', 'sale': 'sale2'},
    {'account_id': '4342343', 'entity': 'entity3', 'system_value': 10.2, 'date': '2010-01-04', 'sale': 'sale3'},
]
df = pd.DataFrame.from_records(s)
print(df)
#    account_id   entity  system_value        date   sale
# 0  1166470734  entity1          10.2  2010-01-02  sale1
# 1  1166470734  entity1           2.2  2010-01-03  sale1
# 2   123232323  entity2           4.2  2010-01-03  sale2
# 3   123232323  entity2           5.2  2010-01-04  sale2
# 4     4342343  entity3          10.2  2010-01-04  sale3
The expected output is:
#    account_id   entity  2010-01-02  2010-01-03  2010-01-04  total_value   sale
# 0  1166470734  entity1        10.2         2.2                     12.4  sale1
# 1   123232323  entity2                     4.2         5.2          9.4  sale2
# 2     4342343  entity3                                10.2         10.2  sale3
Sorry, I'm new to this. How can I get the expected result?
Updating my question based on @Ch3steR's answer:
I tried it and got the error shown below.
import datetime
from decimal import Decimal
import pandas as pd
s = [
    {'account_id': '21312312312', 'entity': 'entityname1', 'ae': 'lwe', 'is_pc': 0, 'type': 2, 'medium': 0, 'our_side_entity': 3, 'settlement_title': 'kim', 'settlement_type': 0, 'date': datetime.date(2020, 4, 9), 'sale': 'sale1', 'system_value': Decimal('1038.36')},
    {'account_id': '21312312312', 'entity': 'entityname1', 'ae': 'lwe', 'is_pc': 0, 'type': 2, 'medium': 0, 'our_side_entity': 3, 'settlement_title': 'kim', 'settlement_type': 0, 'date': datetime.date(2020, 4, 10), 'sale': 'sale1', 'system_value': Decimal('1038.36')},
    {'account_id': '21312312312', 'entity': 'entityname1', 'ae': 'lwe', 'is_pc': 0, 'type': 2, 'medium': 0, 'our_side_entity': 3, 'settlement_title': 'kim', 'settlement_type': 0, 'date': datetime.date(2020, 4, 11), 'sale': 'sale1', 'system_value': Decimal('1038.36')},
    {'account_id': '21312312312', 'entity': 'entityname1', 'ae': 'lwe', 'is_pc': 0, 'type': 2, 'medium': 0, 'our_side_entity': 3, 'settlement_title': 'kim', 'settlement_type': 0, 'date': datetime.date(2020, 4, 12), 'sale': 'sale1', 'system_value': Decimal('1038.36')},
    {'account_id': '21312312312', 'entity': 'entityname1', 'ae': 'lwe', 'is_pc': 0, 'type': 2, 'medium': 0, 'our_side_entity': 3, 'settlement_title': 'kim', 'settlement_type': 0, 'date': datetime.date(2020, 4, 13), 'sale': 'sale1', 'system_value': Decimal('1038.36')},
]
df = pd.DataFrame.from_records(s)
df = df.pivot_table(index=['account_id', 'entity', 'ae', 'is_pc', 'type', 'medium', 'our_side_entity', 'settlement_title', 'settlement_type', 'sale'], columns='date', values='system_value')\
    .assign(total_sum=lambda x: x.sum(axis=1))\
    .reset_index()
print(df)
# raise DataError("No numeric types to aggregate")
# pandas.core.base.DataError: No numeric types to aggregate
You can use DataFrame.pivot_table.
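A minimal sketch of the pivot_table route, applied to the sample data from the question. The choice of index columns and the `aggfunc='sum'` and `fill_value=0` arguments are assumptions made for illustration, not necessarily what the original answer used.

```python
import pandas as pd

# Sample records from the question (system_value stored as plain floats).
s = [
    {'account_id': '1166470734', 'entity': 'entity1', 'system_value': 10.2, 'date': '2010-01-02', 'sale': 'sale1'},
    {'account_id': '1166470734', 'entity': 'entity1', 'system_value': 2.2, 'date': '2010-01-03', 'sale': 'sale1'},
    {'account_id': '123232323', 'entity': 'entity2', 'system_value': 4.2, 'date': '2010-01-03', 'sale': 'sale2'},
    {'account_id': '123232323', 'entity': 'entity2', 'system_value': 5.2, 'date': '2010-01-04', 'sale': 'sale2'},
    {'account_id': '4342343', 'entity': 'entity3', 'system_value': 10.2, 'date': '2010-01-04', 'sale': 'sale3'},
]
df = pd.DataFrame.from_records(s)

result = (
    df.pivot_table(index=['account_id', 'entity', 'sale'],  # one output row per account
                   columns='date',                           # one output column per date
                   values='system_value',
                   aggfunc='sum',
                   fill_value=0)                             # dates with no record become 0
      .assign(total_value=lambda x: x.sum(axis=1))           # row-wise sum over the date columns
      .reset_index()                                         # turn the index levels back into columns
)
print(result)
```

The `assign` step runs before `reset_index`, so the row sum only covers the date columns and never touches the string keys.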
Edit:
Looking at df.dtypes, system_value has dtype object. That is why the error occurs.
df.dtypes
account_id object
entity object
. .
. .
. .
date object
sale object
system_value object
dtype: object
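One way to get past this, sketched under assumptions: convert system_value to float before pivoting. The records below are a hypothetical, trimmed-down version of the data in the question (only the columns needed for the demo), and `astype(float)` gives up the exactness of Decimal in exchange for a numeric dtype that pandas can aggregate.

```python
import datetime
from decimal import Decimal

import pandas as pd

# Hypothetical, trimmed-down records: five consecutive dates, same value each day.
records = [
    {'account_id': '21312312312', 'entity': 'entityname1', 'sale': 'sale1',
     'date': datetime.date(2020, 4, 9) + datetime.timedelta(days=i),
     'system_value': Decimal('1038.36')}
    for i in range(5)
]
df = pd.DataFrame.from_records(records)
print(df['system_value'].dtype)  # object: Decimal values are stored as Python objects

# Convert to a numeric dtype so that pivot_table has something to aggregate.
df['system_value'] = df['system_value'].astype(float)

result = (
    df.pivot_table(index=['account_id', 'entity', 'sale'],
                   columns='date',
                   values='system_value')
      .assign(total_sum=lambda x: x.sum(axis=1))  # row-wise sum over the date columns
      .reset_index()
)
print(result)
```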
After converting system_value to a numeric dtype, this gives the output:
date  account_id       entity   sale  2020-04-09  2020-04-10  2020-04-11  2020-04-12  2020-04-13  total_sum
0       21312312  entityname1  sale1     1038.36     1038.36     1038.36     1038.36     1038.36     5191.8
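Another approach is to group by the key columns and date, sum system_value, and unstack date into columns.
Output:
date   entity   sale  2010-01-02  2010-01-03  2010-01-04  total_value
0     entity1  sale1        10.2         2.2         0.0         12.4
1     entity2  sale2         0.0         4.2         5.2          9.4
2     entity3  sale3         0.0         0.0        10.2         10.2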
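The output above fills missing dates with 0.0, while the expected output in the question leaves them blank and also keeps account_id. A sketch of one way to get closer to that, assuming it is acceptable to add account_id to the grouping keys and to blank out missing cells only when printing:

```python
import pandas as pd

# Sample records from the question.
s = [
    {'account_id': '1166470734', 'entity': 'entity1', 'system_value': 10.2, 'date': '2010-01-02', 'sale': 'sale1'},
    {'account_id': '1166470734', 'entity': 'entity1', 'system_value': 2.2, 'date': '2010-01-03', 'sale': 'sale1'},
    {'account_id': '123232323', 'entity': 'entity2', 'system_value': 4.2, 'date': '2010-01-03', 'sale': 'sale2'},
    {'account_id': '123232323', 'entity': 'entity2', 'system_value': 5.2, 'date': '2010-01-04', 'sale': 'sale2'},
    {'account_id': '4342343', 'entity': 'entity3', 'system_value': 10.2, 'date': '2010-01-04', 'sale': 'sale3'},
]
df = pd.DataFrame.from_records(s)

wide = (
    df.groupby(['account_id', 'entity', 'sale', 'date'])['system_value'].sum()
      .unstack('date')                      # no fill_value: dates without a record stay NaN
)
wide['total_value'] = wide.sum(axis=1)      # sum skips NaN by default
wide = wide.reset_index()

# Keep NaN in the data and only render it as an empty cell for display.
print(wide.to_string(na_rep=''))
```

Keeping NaN in the frame rather than replacing it with empty strings leaves the date columns numeric, so later arithmetic still works.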
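Thanks for your reply. I first chose your answer as the best answer, but then I tried another example and it failed. I have updated the details above.
@jiaJimmy No problem. ;) SO encourages you to accept the answer that works best for you. ;)
I found the cause of the error: system_value is of type object. The answer now covers the problem you ran into.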
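# groupby/unstack approach: sum system_value per (entity, date, sale),
# spread the dates into columns, then add a row total
(df.groupby(['entity', 'date', 'sale']).system_value.sum()
   .unstack('date', fill_value=0)
   .assign(total_value=lambda x: x.sum(axis=1))
   .reset_index()
)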