Python: multiple operations on a DataFrame
I am trying to group by one or two columns, then compute the sum of the fourth column and the median of the fifth column. Each operation should be written to a separate output. I am finding it a bit tricky to get started. Input: no header row, more than 100k rows
StartTime, EndTime,Day,SumCount,UniqueCount
00:00:00,01:00:00,Mon,13534,594
01:00:00,02:00:00,Mon,16674,626
02:00:00,03:00:00,Mon,23736,671
03:00:00,04:00:00,Mon,16977,671
00:00:00,01:00:00,Tue,17262,747
01:00:00,02:00:00,Tue,19072,777
02:00:00,03:00:00,Tue,18275,785
03:00:00,04:00:00,Tue,13589,757
04:00:00,05:00:00,Tue,16053,735
05:00:00,06:00:00,Tue,11440,636
What I want to find is:
Group by StartTime and EndTime: find the sum of SumCount and the median of UniqueCount
Group by Day: find the sum of SumCount and the median of UniqueCount
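# named aggregation (pandas >= 0.25); nested dicts inside agg() raise SpecificationError in pandas >= 1.0
import numpy as np
df.groupby(['StartTime', 'EndTime']).agg(SumCount=('SumCount', 'sum'),
    UniqueCount=('UniqueCount', lambda x: np.median(x).round(0)))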
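Or, grouping by Day:
# named aggregation (pandas >= 0.25); nested dicts inside agg() raise SpecificationError in pandas >= 1.0
df.groupby(['Day']).agg(SumCount=('SumCount', 'sum'),
    UniqueCount=('UniqueCount', lambda x: np.median(x).round(0)))
Comments:
SitzBlogz: Thanks! But the median must not have float values. How do I fix that?
Thank you very much. I will write back if I run into any errors. Thanks again.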
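Regarding the follow-up about float medians, here is a minimal sketch of one possible approach, not necessarily what the answerer had in mind. It assumes the input is a headerless CSV with the five columns shown in the question. The helper name summarize and the variables by_time and by_day are hypothetical. The median is computed as a float, then rounded and cast to an integer dtype. Note that pandas rounds exact halves to the nearest even number (648.5 becomes 648).

```python
import io

import pandas as pd

# Sample rows copied from the question; the real file is assumed to have no header.
raw = """00:00:00,01:00:00,Mon,13534,594
01:00:00,02:00:00,Mon,16674,626
02:00:00,03:00:00,Mon,23736,671
03:00:00,04:00:00,Mon,16977,671
00:00:00,01:00:00,Tue,17262,747
01:00:00,02:00:00,Tue,19072,777
02:00:00,03:00:00,Tue,18275,785
03:00:00,04:00:00,Tue,13589,757
04:00:00,05:00:00,Tue,16053,735
05:00:00,06:00:00,Tue,11440,636
"""
cols = ["StartTime", "EndTime", "Day", "SumCount", "UniqueCount"]
# header=None plus names= supplies column labels for a headerless input.
df = pd.read_csv(io.StringIO(raw), header=None, names=cols)


def summarize(frame, keys):
    """Per group: sum of SumCount and rounded integer median of UniqueCount."""
    out = frame.groupby(keys).agg(
        SumCount=("SumCount", "sum"),
        UniqueCount=("UniqueCount", "median"),
    )
    # The median comes back as float; round it, then cast to an integer dtype.
    out["UniqueCount"] = out["UniqueCount"].round().astype("int64")
    return out.reset_index()


by_time = summarize(df, ["StartTime", "EndTime"])  # one row per time slot
by_day = summarize(df, ["Day"])                    # one row per day

print(by_time)
print(by_day)
```

To write each result to its own output, something like by_time.to_csv("by_time.csv", index=False) and by_day.to_csv("by_day.csv", index=False) should work; the file names here are placeholders.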