Python: aggregating a DataFrame with conditions using NamedAgg

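I have an orders table with an order_state column. I need to count the number of orders in each order state, grouped by hour, but without grouping by the order_state column itself. I would like to use NamedAgg. Is that possible? Something like this: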

orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').count()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').count())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
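The result should have a separate count for each order state, but right now I get the total number of orders in every column.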

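I think you need to change .count() to .sum() to count the True values. .count() counts every non-null element of the boolean mask, False included, so it returns the group size; .sum() treats True as 1 and False as 0: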

orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    # .sum() belongs inside the lambda, applied to the boolean mask
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').sum()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').sum())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
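
To check the difference on a small case, here is a minimal sketch. The sample rows are made up for illustration; only the column names come from the question. It uses the lowercase 'h' frequency alias, on the assumption that you are on a recent pandas where the uppercase 'H' is deprecated.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question.
orders = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2021-01-01 10:05", "2021-01-01 10:20", "2021-01-01 10:40",
        "2021-01-01 11:10", "2021-01-01 11:30",
    ]),
    "source": ["web", "web", "web", "app", "app"],
    "order_state": ["finished", "offer_cancelled", "finished",
                    "finished", "new"],
})

# On a boolean mask, count() counts all non-null entries (True and False),
# while sum() counts only the True entries.
mask = orders["order_state"] == "finished"
total_rows = mask.count()    # every row in the mask
finished_rows = mask.sum()   # only rows where the state is "finished"

orders_agg = (
    orders.groupby([pd.Grouper(key="created_at", freq="h"), "source"])
    .agg(
        orders_count=pd.NamedAgg("created_at", "count"),
        finished_orders_count=pd.NamedAgg(
            "order_state", lambda x: (x == "finished").sum()
        ),
        cancelled_orders_count=pd.NamedAgg(
            "order_state", lambda x: (x == "offer_cancelled").sum()
        ),
    )
    .reset_index()
    .rename(columns={"created_at": "datetime_msk"})
)

print(orders_agg)
```

Each conditional column now holds only the orders in that state for the given hour and source, while orders_count still holds the group total.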
