Python 3.x Pandas groupby: aggregating over only a subset of records
I have the following DataFrame:
id src target duration
001 A C 4
001 B C 3
001 C C 2
002 B D 5
002 C D 2
I did some aggregation with the code below, and it works fine:
df_new = df.groupby(['id','target']) \
    .apply(lambda x: pd.Series({'min_duration': min(x['duration']),
                                'total_duration':sum(x['duration']),
                                'all_src':list(x['src'])
                                })).reset_index()
Now I only want to aggregate over the records where src != target. I modified my code as follows:
df_new = df.groupby(['id','target']) \
    .apply(lambda x: pd.Series({'min_duration': min(x['duration']),
                                'total_duration':sum(x['duration']),
                                'total_duration_condition':sum(x['duration']) if x['src'] != x['target'],
                                'all_src':list(x['src'])
                                })).reset_index()
But I get an invalid syntax error on the new line:
'total_duration_condition':sum(x['duration']) if x['src'] != x['target']
What is the correct way to sum over only a subset of the records? Thanks.
Try writing the code like this:
df.groupby(['id','target']).apply(lambda x: pd.Series({'min_duration': min(x['duration']),
    'total_duration':sum(x['duration']),
    'total_duration_condition':sum(x['duration'][x['src'] != x['target']]),  # I changed this part
    'all_src':list(x['src'])
    })).reset_index()
Change the line
'total_duration_condition':sum(x['duration']) if x['src'] != x['target']
to
'total_duration_condition':sum(x['duration'][x['src'] != x['target']])
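Some background on why the original line fails: a Python conditional expression needs an else branch, and x['src'] != x['target'] produces a boolean Series rather than a single True/False, so it works as a row filter but not as an if condition. The same masking idea can also be applied before grouping. The sketch below is an alternative to the apply-based code above, not the same method. It assumes the sample data from the question, with id stored as strings, and it introduces duration_condition as a hypothetical helper column name.

```python
import pandas as pd

# Sample data copied from the question; ids are strings to keep the leading zeros.
df = pd.DataFrame({
    'id': ['001', '001', '001', '002', '002'],
    'src': ['A', 'B', 'C', 'B', 'C'],
    'target': ['C', 'C', 'C', 'D', 'D'],
    'duration': [4, 3, 2, 5, 2],
})

# Boolean mask: True for the rows that should count toward the conditional sum.
mask = df['src'] != df['target']

df_new = (
    # Helper column (hypothetical name): duration where src != target, else 0.
    df.assign(duration_condition=df['duration'].where(mask, 0))
      .groupby(['id', 'target'], as_index=False)
      .agg(
          min_duration=('duration', 'min'),
          total_duration=('duration', 'sum'),
          total_duration_condition=('duration_condition', 'sum'),
          all_src=('src', list),
      )
)

print(df_new)
```

Zeroing out the excluded rows before grouping keeps every group in the result, including groups where no row satisfies the condition (their conditional sum is 0).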