Python pandas: finding the median group by group

python pandas

df.head(10).to_clipboard(sep=';', index=True)

I have a dataframe as shown above, with the following column descriptions:

•   Id - the uuid of this delivery
•   PlanId - the uuid of the plan (the plan for deliveries of a given day)
•   PlanDate - the date of delivery
•   MinTime - the minimal time (seconds from midnight) for delivering this delivery
•   MaxTime - the maximal time (seconds from midnight) for delivering this delivery
•   RouteId - the uuid of the route this delivery belongs to
•   ETA - the estimated time for arrival of this delivery on this date (from the eta you can of course order the deliveries in a route)
•   TTN - the time to next delivery in the route, i.e., at index 3 that would be the time distance between delivery index 3 and delivery index 4
•   DTN - the distance to next delivery in the route.

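I need to find:

The median number of deliveries per route in a given plan

The median distance driven per route in a given plan

The median time driven per route in a given plan

How do I do this?

I wonder whether this is just a simple median calculation where you group and aggregate. I tried something like this to find the median distance: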

Tx = df.groupby(by=['plan_id', 'route_id'], as_index=False)['dtn'].sum()
Tx.groupby(['plan_id', 'route_id'])['dtn'].median()

However, I am not sure whether this is correct.

Here is a way to display the desired numbers:

# Subset the dataframe to only the desired plan id
sub_Tx = Tx[Tx['plan_id'] == "869BB6FB-…"]
# Median deliveries per route in a given plan
sub_df = sub_Tx[['plan_id', 'route_id']].copy()
sub_df['count_deliveries'] = 1
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).median()
# Median distance driven per route in a given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'dtn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).median()
# Median time driven per route in a given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'ttn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).median()

Good luck

Update:

So you can compute the median of the route figures (number of deliveries, distance, and time) per plan id as follows:

# Median deliveries per route in a given plan
sub_df = sub_Tx[['plan_id', 'route_id']].copy()
sub_df['count_deliveries'] = 1
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df = sub_df[['plan_id', 'count_deliveries']].rename(columns={'count_deliveries': 'median_deliveries'})
sub_df.groupby(by=['plan_id'], as_index=False).median()
# Median distance driven per route in a given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'dtn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df = sub_df[['plan_id', 'dtn']].rename(columns={'dtn': 'median_dtn'})
sub_df.groupby(by=['plan_id'], as_index=False).median()
# Median time driven per route in a given plan
sub_df = sub_Tx[['plan_id', 'route_id', 'ttn']]
sub_df = sub_df.groupby(by=['plan_id', 'route_id'], as_index=False).sum()
sub_df = sub_df[['plan_id', 'ttn']].rename(columns={'ttn': 'median_ttn'})
sub_df.groupby(by=['plan_id'], as_index=False).median()
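The two-step idea in the update (total each route first, then take the median of those route totals within each plan) can also be sketched in a single pass with named aggregation. This is only a sketch under assumptions: the column names `plan_id`, `route_id`, `dtn` and `ttn` follow the code in this thread, while the toy dataframe and the names `per_route`, `per_plan`, `total_dtn` and `total_ttn` are invented for illustration.

```python
import pandas as pd

# Toy data, invented for illustration only; the real ids are uuids.
df = pd.DataFrame({
    "plan_id":  ["P1"] * 6,
    "route_id": ["R1", "R1", "R2", "R2", "R2", "R3"],
    "dtn":      [10.0, 20.0, 5.0, 5.0, 5.0, 40.0],
    "ttn":      [100.0, 200.0, 50.0, 50.0, 50.0, 400.0],
})

# Step 1: one row per (plan, route) holding that route's totals.
per_route = df.groupby(["plan_id", "route_id"], as_index=False).agg(
    count_deliveries=("dtn", "size"),  # number of rows, i.e. deliveries
    total_dtn=("dtn", "sum"),          # distance driven on the route
    total_ttn=("ttn", "sum"),          # time driven on the route
)

# Step 2: median of the route totals within each plan.
per_plan = per_route.groupby("plan_id", as_index=False).agg(
    median_deliveries=("count_deliveries", "median"),
    median_dtn=("total_dtn", "median"),
    median_ttn=("total_ttn", "median"),
)
print(per_plan)
```

With this toy data the three routes total 30, 15 and 40 distance units, so the plan-level median distance is 30. Grouping by `plan_id` alone in the second step is what makes the median range over routes.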

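Please provide sample data in text format. The example cannot be reproduced from image data.

Hello, thanks for your reply. Have you noticed that the median results are the same as the sum results? Is that correct?

Hello, does each plan have different route ids? I just edited my post to compute the median per plan.

groupby(by=['plan_id', 'route_id', 'dtn'], axis=0, as_index=False).sum() sub_df.groupby(by=['plan_id'], axis=0, as_index=False).median() Is this how I do it for distance and time?

Yes, it seems so. Did you get the expected result?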
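The observation in the comments that the median matched the sum can be reproduced on a small example. After summing per (`plan_id`, `route_id`), each such pair has exactly one row, so a median over the same two keys returns that single value unchanged. A minimal sketch, assuming hypothetical route totals (the values and the names `totals`, `same_as_sum` and `per_plan` are made up):

```python
import pandas as pd

# Hypothetical route totals: already one row per (plan_id, route_id).
totals = pd.DataFrame({
    "plan_id":  ["P1", "P1", "P1"],
    "route_id": ["R1", "R2", "R3"],
    "dtn":      [30.0, 15.0, 40.0],
})

# Grouping again by both keys: every group contains a single row,
# so the "median" is just that row's value (the earlier sum).
same_as_sum = totals.groupby(["plan_id", "route_id"], as_index=False)["dtn"].median()

# Grouping by the plan only: the median is taken across the plan's routes.
per_plan = totals.groupby("plan_id", as_index=False)["dtn"].median()

print(same_as_sum)
print(per_plan)
```

Here `same_as_sum` repeats the three totals, while `per_plan` collapses them to one median per plan.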