Sorting a pandas Series index by month in Python (time series differ)
I have a Series object that looks like this:
df =
index value
2014-05-23 07:00:00 0.67
2014-05-23 07:30:00 0.47
2014-05-23 08:00:00 0.42
2014-05-23 08:30:00 0.80
....
2017-07-10 22:00:00 0.42
2017-07-10 22:30:00 0.79
2017-07-10 23:00:00 0.84
2017-07-10 23:30:00 NaN
I want to average the values across the years and then group them by month, so that the data frame looks like this:
df_new =
index value
Jan {0.11, 0.5, 0.3, 0.99, ... ,0.13} <- time step of each value is
Feb {...............................} still 30 min, and each
Mar {...............................} value is average of same
Apr {...............................} time in the other year.
....
Dec {...............................}
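I think you first need to upsample or downsample with resample. Then group by the months, converted to an ordered Categorical with strftime, together with the time of day, aggregate with mean, and finally reshape with unstack (see the code below).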
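Add columns for the day, month, hour and minute, for example: df['month'] = df.index.month. Then use gb = df.groupby(by=['month','day','hour','minutes'])['value'].mean(). See whether the gb result is close to what you want.
This is awesome! Thanks a lot :)
@Chi Glad I could help!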
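The column-based suggestion in the comment above can be sketched roughly as follows. This is only an illustration: the synthetic two-year data set and the random values are assumptions, not the asker's data. The column names follow the comment.

```python
import numpy as np
import pandas as pd

# Synthetic half-hourly data over two full years (made up for illustration)
idx = pd.date_range("2014-01-01", "2015-12-31 23:30", freq="30min")
rng = np.random.default_rng(0)
df = pd.DataFrame({"value": rng.random(len(idx))}, index=idx)

# Helper columns derived from the DatetimeIndex
df["month"] = df.index.month
df["day"] = df.index.day
df["hour"] = df.index.hour
df["minutes"] = df.index.minute

# Mean over all years for each (month, day, hour, minute) slot
gb = df.groupby(by=["month", "day", "hour", "minutes"])["value"].mean()
print(gb.head())
```

Note that this keeps the day of the month as a grouping key, so gb is a Series with one entry per calendar slot rather than one row per month.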
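import pandas as pd

# s is the Series from the question, indexed by a DatetimeIndex
# upsample to 15 minutes (optional)
s = s.resample('15Min').ffill()
# or downsample to 60 minutes (optional)
#s = s.resample('60Min').mean()
# if the values are already at 30-minute steps, no resample is necessary
# ordered categories keep the months in calendar order instead of alphabetical order
cats = ['Jan', 'Feb', 'Mar', 'Apr','May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
months = pd.Categorical(s.index.strftime('%b'), categories=cats, ordered=True)
# mean per (month, time of day) across all years; unstack moves the times into columns
df = s.groupby([months, s.index.time]).mean().unstack()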
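To try the approach end to end, here is a minimal runnable sketch on synthetic data. The two-year 30-minute series and its values are assumptions made up for illustration, and the resample step is skipped because the data is already at 30-minute steps. It also assumes an English locale, since strftime('%b') produces locale-dependent month abbreviations.

```python
from datetime import time

import numpy as np
import pandas as pd

# Synthetic 30-minute series over two full years (made up for illustration)
idx = pd.date_range("2014-01-01", "2015-12-31 23:30", freq="30min")
s = pd.Series(np.arange(len(idx), dtype=float), index=idx)

# Ordered categories keep the months in calendar order
cats = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
        'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
months = pd.Categorical(s.index.strftime('%b'), categories=cats, ordered=True)

# Mean per (month, time of day) across both years, times moved into columns.
# observed=False keeps every month category, even one with no data.
out = s.groupby([months, s.index.time], observed=False).mean().unstack()

print(out.shape)                      # one row per month, one column per time of day
jan_0730 = out.loc['Jan', time(7, 30)]
print(jan_0730)
```

Each row of out then holds the 48 half-hourly means for one month, which is close to the layout sketched in the question.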