Converting a list of values to a time series in Python


I want to convert the following data:

jan_1   jan_15  feb_1   feb_15  mar_1   mar_15  apr_1   apr_15  may_1   may_15  jun_1   jun_15  jul_1   jul_15  aug_1   aug_15  sep_1   sep_15  oct_1   oct_15  nov_1   nov_15  dec_1   dec_15
0       0       0       0       0       1       1       2       2       2       2       2       2        3      3       3       3       3       0       0       0       0       0       0
into an array of length 365, where each element is repeated until the next date, e.g. 0 is repeated from Jan 1 to Jan 15.

I could do something like numpy.repeat, but that is not date-aware, so it would not account for feb_15 and mar_1 being less than 15 days apart.

Is there a Pythonic solution for this?
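To make the problem concrete, here is a minimal standard-library sketch of a date-aware repeat: each value is repeated for the number of days until the next labelled date, and the last value runs to Dec 31. The helper name expand_to_daily and the choice of 1900 as the year (a non-leap year, matching the dates printed in the answers) are assumptions for illustration only.

```python
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

# Column labels and values from the question.
labels = [f"{m}_{d}" for m in MONTHS for d in (1, 15)]
values = [0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2,
          2, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0]


def expand_to_daily(labels, values, year=1900):
    """Repeat each value until the next labelled date (hypothetical helper)."""
    starts = []
    for label in labels:
        month_name, day = label.split("_")
        starts.append(date(year, MONTHS.index(month_name) + 1, int(day)))
    # Jan 1 of the following year closes the last interval.
    bounds = starts + [date(year + 1, 1, 1)]
    daily = []
    for value, start, end in zip(values, bounds, bounds[1:]):
        # The repeat count comes from the calendar, not a fixed 15.
        daily.extend([value] * (end - start).days)
    return daily


daily = expand_to_daily(labels, values)
print(len(daily))
```

The same per-interval day counts could be passed to numpy.repeat as its repeats argument if a NumPy array is needed.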

You can use:
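A minimal sketch of one way to get a 365-row daily result with pandas. The names df1 and col are taken from the printed output further down; building the frame from the question's data and using reindex with method='ffill' are assumptions, not necessarily the answerer's exact code. The year 1900 is pinned explicitly because that is the year shown in the printed dates.

```python
import pandas as pd

months = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
labels = [f"{m}_{d}" for m in months for d in (1, 15)]
values = [0, 0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2,
          2, 3, 3, 3, 3, 3, 0, 0, 0, 0, 0, 0]

# One-row frame shaped like the data in the question.
df = pd.DataFrame([values], columns=labels)

# Transpose so the date labels become the index, then name the column.
df1 = df.T
df1.columns = ["col"]

# Parse labels such as 'jan_1' into dates in the (non-leap) year 1900.
df1.index = pd.to_datetime([f"1900_{s}" for s in df1.index],
                           format="%Y_%b_%d")

# Reindex onto every day of the year, carrying each value forward
# until the next labelled date (and past dec_15 to Dec 31).
full_year = pd.date_range("1900-01-01", "1900-12-31", freq="D")
df1 = df1.reindex(full_year, method="ffill")

arr = df1["col"].to_numpy()
print(arr.shape)
```

Reindexing against an explicit full-year range, rather than resampling, is what extends the series beyond the last label (dec_15) to Dec 31.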


IIUC, you can do it this way:

In [194]: %paste
# transpose DF, rename columns
x = df.T.reset_index().rename(columns={'index':'date', 0:'val'})
# parse dates
x['date'] = pd.to_datetime(x['date'], format='%b_%d')
# group resampled DF by the month and resample(`D`) each group 
result = (x.groupby(x['date'].dt.month)
           .apply(lambda x: x.set_index('date').resample('1D').ffill()))
# rename index names
result.index.names = ['month','date']
## -- End pasted text --

In [212]: result
Out[212]:
                  val
month date
1     1900-01-01    0
      1900-01-02    0
      1900-01-03    0
      1900-01-04    0
      1900-01-05    0
      1900-01-06    0
      1900-01-07    0
      1900-01-08    0
      1900-01-09    0
      1900-01-10    0
      1900-01-11    0
      1900-01-12    0
      1900-01-13    0
      1900-01-14    0
      1900-01-15    0
2     1900-02-01    0
      1900-02-02    0
      1900-02-03    0
      1900-02-04    0
      1900-02-05    0
      1900-02-06    0
      1900-02-07    0
      1900-02-08    0
      1900-02-09    0
      1900-02-10    0
      1900-02-11    0
      1900-02-12    0
      1900-02-13    0
      1900-02-14    0
      1900-02-15    0
...               ...
11    1900-11-01    0
      1900-11-02    0
      1900-11-03    0
      1900-11-04    0
      1900-11-05    0
      1900-11-06    0
      1900-11-07    0
      1900-11-08    0
      1900-11-09    0
      1900-11-10    0
      1900-11-11    0
      1900-11-12    0
      1900-11-13    0
      1900-11-14    0
      1900-11-15    0
12    1900-12-01    0
      1900-12-02    0
      1900-12-03    0
      1900-12-04    0
      1900-12-05    0
      1900-12-06    0
      1900-12-07    0
      1900-12-08    0
      1900-12-09    0
      1900-12-10    0
      1900-12-11    0
      1900-12-12    0
      1900-12-13    0
      1900-12-14    0
      1900-12-15    0

[180 rows x 1 columns]
Or use reset_index() to flatten the MultiIndex (see the In [213] output at the end):


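I don't think the output has length 365, as the OP wants.

@jezrael, I don't think so... The OP says: each element is repeated until the next date, e.g. 0 is repeated from Jan 1 to Jan 15.

OK, but that sentence starts with "into an array of length 365"... No problem, if you are right, your answer will be accepted.

Your question is unclear. How are 0, 1, 2 determined? Please also show what you have tried.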
          col  
1900-01-01  0
1900-01-02  0
1900-01-03  0
1900-01-04  0
1900-01-05  0
1900-01-06  0
1900-01-07  0
1900-01-08  0
1900-01-09  0
1900-01-10  0
1900-01-11  0
1900-01-12  0
1900-01-13  0
1900-01-14  0
1900-01-15  0
1900-01-16  0
1900-01-17  0
1900-01-18  0
1900-01-19  0
1900-01-20  0
1900-01-21  0
1900-01-22  0
1900-01-23  0
1900-01-24  0
1900-01-25  0
1900-01-26  0
1900-01-27  0
1900-01-28  0
1900-01-29  0
1900-01-30  0
       ..
1900-12-02  0
1900-12-03  0
1900-12-04  0
1900-12-05  0
1900-12-06  0
1900-12-07  0
1900-12-08  0
1900-12-09  0
1900-12-10  0
1900-12-11  0
1900-12-12  0
1900-12-13  0
1900-12-14  0
1900-12-15  0
1900-12-16  0
1900-12-17  0
1900-12-18  0
1900-12-19  0
1900-12-20  0
1900-12-21  0
1900-12-22  0
1900-12-23  0
1900-12-24  0
1900-12-25  0
1900-12-26  0
1900-12-27  0
1900-12-28  0
1900-12-29  0
1900-12-30  0
1900-12-31  0

[365 rows x 1 columns]
# if a Series is needed
print (df1.col)
1900-01-01    0
1900-01-02    0
1900-01-03    0
1900-01-04    0
1900-01-05    0
1900-01-06    0
1900-01-07    0
1900-01-08    0
1900-01-09    0
1900-01-10    0
1900-01-11    0
1900-01-12    0
1900-01-13    0
1900-01-14    0
1900-01-15    0
1900-01-16    0
1900-01-17    0
1900-01-18    0
1900-01-19    0
1900-01-20    0
1900-01-21    0
1900-01-22    0
1900-01-23    0
1900-01-24    0
1900-01-25    0
1900-01-26    0
1900-01-27    0
1900-01-28    0
1900-01-29    0
1900-01-30    0
             ..
1900-12-02    0
1900-12-03    0
1900-12-04    0
1900-12-05    0
1900-12-06    0
1900-12-07    0
1900-12-08    0
1900-12-09    0
1900-12-10    0
1900-12-11    0
1900-12-12    0
1900-12-13    0
1900-12-14    0
1900-12-15    0
1900-12-16    0
1900-12-17    0
1900-12-18    0
1900-12-19    0
1900-12-20    0
1900-12-21    0
1900-12-22    0
1900-12-23    0
1900-12-24    0
1900-12-25    0
1900-12-26    0
1900-12-27    0
1900-12-28    0
1900-12-29    0
1900-12-30    0
1900-12-31    0
Freq: D, Name: col, dtype: int64
#transpose and convert to numpy array
print (df1.T.values)
[[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
  1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
  2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
  2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
  3 3 3 3 3 3 3 3 3 3 3 3 3 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]
In [213]: result.reset_index().head(20)
Out[213]:
    month       date  val
0       1 1900-01-01    0
1       1 1900-01-02    0
2       1 1900-01-03    0
3       1 1900-01-04    0
4       1 1900-01-05    0
5       1 1900-01-06    0
6       1 1900-01-07    0
7       1 1900-01-08    0
8       1 1900-01-09    0
9       1 1900-01-10    0
10      1 1900-01-11    0
11      1 1900-01-12    0
12      1 1900-01-13    0
13      1 1900-01-14    0
14      1 1900-01-15    0
15      2 1900-02-01    0
16      2 1900-02-02    0
17      2 1900-02-03    0
18      2 1900-02-04    0
19      2 1900-02-05    0