Python: Aggregating a DataFrame with conditions using NamedAgg
I have an orders table with an order_state column. I need to count the number of orders in each order state, grouped by hour, but without adding order_state to the group-by columns. I would like to use NamedAgg. Is that possible? Something like this:
orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').count()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').count())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
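The result should be:
But right now I get the total number of orders in every column.
I think you need to change .count() to .sum() to count the True values. The comparison x == 'finished' produces a boolean Series, and .count() counts every non-missing entry, True and False alike, whereas .sum() adds up only the True values: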
orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    # .sum() must be called on the boolean Series inside the lambda, not on the NamedAgg object
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').sum()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').sum())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
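To see the difference on concrete data, here is a minimal self-contained sketch. The sample rows are made up for illustration and are not from the question. It also makes two assumptions that differ from the snippet above: it uses the lowercase 'h' frequency alias, since recent pandas versions deprecate 'H', and it counts order_state rather than created_at for the total, which gives the same number as long as order_state has no missing values.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
orders = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2021-01-01 10:05", "2021-01-01 10:20", "2021-01-01 10:40",
        "2021-01-01 11:10", "2021-01-01 11:30",
    ]),
    "source": ["web", "web", "web", "app", "app"],
    "order_state": ["finished", "offer_cancelled", "finished",
                    "finished", "new"],
})

orders_agg = (
    orders.groupby([pd.Grouper(key="created_at", freq="h"), "source"])
    .agg(
        # Total number of orders in each (hour, source) group.
        orders_count=pd.NamedAgg("order_state", "count"),
        # Summing a boolean Series counts only the True entries.
        finished_orders_count=pd.NamedAgg(
            "order_state", lambda s: (s == "finished").sum()
        ),
        cancelled_orders_count=pd.NamedAgg(
            "order_state", lambda s: (s == "offer_cancelled").sum()
        ),
    )
    .reset_index()
    .rename(columns={"created_at": "datetime_msk"})
)

print(orders_agg)
```

With .count() in place of .sum(), both conditional columns would simply repeat orders_count, which is the symptom described in the question.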