Python: fast event-based time-range calculations on a time series
I have a time series (freq='D') of events: the value is 0 when there is no event and 1 when there is one. An event usually lasts for several consecutive days. I want to compute two variables over the span of each event:
a value giving the week number counted from the start of the event (Saturday is the end of a week)
the day number within that week of the event
Here is an example of what I am trying to do:
import numpy as np
import pandas as pd

# Dummy up a test frame
date = pd.date_range(start='20150101', end='20150121', freq='D')
event = np.zeros(len(date))
event[2:5] = 1.
event[15:20] = 1.
df_test = pd.DataFrame({'date': date, 'event': event})
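The data looks like this. As you can see, the event occurs twice within the time range. I also compute a "snapped date", which refers to the Saturday that ends each date's week.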
In[2]: df_test
Out[2]:
date event
0 2015-01-01 0.0
1 2015-01-02 0.0
2 2015-01-03 1.0
3 2015-01-04 1.0
4 2015-01-05 1.0
5 2015-01-06 0.0
6 2015-01-07 0.0
7 2015-01-08 0.0
8 2015-01-09 0.0
9 2015-01-10 0.0
10 2015-01-11 0.0
11 2015-01-12 0.0
12 2015-01-13 0.0
13 2015-01-14 0.0
14 2015-01-15 0.0
15 2015-01-16 1.0
16 2015-01-17 1.0
17 2015-01-18 1.0
18 2015-01-19 1.0
19 2015-01-20 1.0
20 2015-01-21 0.0
I started by computing the week boundary for each date, like this:
df_test.loc[:, 'snapped_date'] = df_test.date.map(pd.tseries.frequencies.to_offset('W-SAT').rollforward)
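The map call above applies rollforward to one timestamp at a time. As a possible alternative, the same week-ending Saturday can be obtained with plain day-of-week arithmetic on the whole column. This is only a sketch: the names dates, days_to_sat and snapped are made up for illustration, and it assumes weeks run Sunday through Saturday as in the question.

```python
import pandas as pd

# Same date range as the test frame in the question.
dates = pd.Series(pd.date_range(start='20150101', end='20150121', freq='D'))

# dayofweek is Monday=0 ... Saturday=5, Sunday=6, so this is the number
# of days until the next Saturday (0 when the date is already a Saturday).
days_to_sat = (5 - dates.dt.dayofweek) % 7

# Shift every date forward to the Saturday that ends its week.
snapped = dates + pd.to_timedelta(days_to_sat, unit='D')
```

For the sample dates this should give the same values as the rollforward version (for example, 2015-01-04 snaps to 2015-01-10), without a Python-level call per row.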
Now I would like to compute the following two new columns:
date snapped_date event week_of_event day_within_week_of_event
0 2015-01-01 2015-01-03 0.0 0.0 0.0
1 2015-01-02 2015-01-03 0.0 0.0 0.0
2 2015-01-03 2015-01-03 1.0 1.0 1.0
3 2015-01-04 2015-01-10 1.0 2.0 1.0
4 2015-01-05 2015-01-10 1.0 2.0 2.0
5 2015-01-06 2015-01-10 0.0 0.0 0.0
6 2015-01-07 2015-01-10 0.0 0.0 0.0
7 2015-01-08 2015-01-10 0.0 0.0 0.0
8 2015-01-09 2015-01-10 0.0 0.0 0.0
9 2015-01-10 2015-01-10 0.0 0.0 0.0
10 2015-01-11 2015-01-17 0.0 0.0 0.0
11 2015-01-12 2015-01-17 0.0 0.0 0.0
12 2015-01-13 2015-01-17 0.0 0.0 0.0
13 2015-01-14 2015-01-17 0.0 0.0 0.0
14 2015-01-15 2015-01-17 0.0 0.0 0.0
15 2015-01-16 2015-01-17 1.0 1.0 1.0
16 2015-01-17 2015-01-17 1.0 1.0 2.0
17 2015-01-18 2015-01-24 1.0 2.0 1.0
18 2015-01-19 2015-01-24 1.0 2.0 2.0
19 2015-01-20 2015-01-24 1.0 2.0 3.0
20 2015-01-21 2015-01-24 0.0 0.0 0.0
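Is there any time-series functionality in pandas that would help me do this quickly? I have multiple such time series and would eventually like to do this as a groupby-transform.
This can be achieved with the following ugly solution: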
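# df is the test frame from the question (df_test)
# Running week counter: increases by one on every Sunday
df['new']=((df.date.dt.dayofweek+1)//7).cumsum()
# Label for each run of consecutive equal event values
df['new2']=df.event.diff().ne(0).cumsum()
# Within each event run, count the distinct weeks seen so far.
# group_keys=False keeps the original row index so the result aligns with df on recent pandas versions.
df['week_of_event']=df.loc[df.event!=0].groupby('new2', group_keys=False).new.apply(lambda x : x.rolling(len(x), min_periods=1).apply(lambda y: len(np.unique(y))))
# Position of each day within its (run, week) pair
df['day_within_week_of_event']=df.loc[df.event!=0].groupby(['new2','week_of_event']).cumcount()+1
df.fillna(0)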
Out[140]:
date event new new2 week_of_event day_within_week_of_event
0 2015-01-01 0.0 0 1 0.0 0.0
1 2015-01-02 0.0 0 1 0.0 0.0
2 2015-01-03 1.0 0 2 1.0 1.0
3 2015-01-04 1.0 1 2 2.0 1.0
4 2015-01-05 1.0 1 2 2.0 2.0
5 2015-01-06 0.0 1 3 0.0 0.0
6 2015-01-07 0.0 1 3 0.0 0.0
7 2015-01-08 0.0 1 3 0.0 0.0
8 2015-01-09 0.0 1 3 0.0 0.0
9 2015-01-10 0.0 1 3 0.0 0.0
10 2015-01-11 0.0 2 3 0.0 0.0
11 2015-01-12 0.0 2 3 0.0 0.0
12 2015-01-13 0.0 2 3 0.0 0.0
13 2015-01-14 0.0 2 3 0.0 0.0
14 2015-01-15 0.0 2 3 0.0 0.0
15 2015-01-16 1.0 2 4 1.0 1.0
16 2015-01-17 1.0 2 4 1.0 2.0
17 2015-01-18 1.0 3 4 2.0 1.0
18 2015-01-19 1.0 3 4 2.0 2.0
19 2015-01-20 1.0 3 4 2.0 3.0
20 2015-01-21 0.0 3 5 0.0 0.0
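The same two columns can probably be produced without the nested rolling apply. The sketch below is one way to do it, not a benchmarked or definitive answer: it reuses the run-label and Sunday-counter ideas from the solution above, the helper names (is_event, run_id, week_idx, week, day) are made up for illustration, and it assumes the dates are daily with no gaps, as in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the test frame from the question.
date = pd.date_range(start='20150101', end='20150121', freq='D')
event = np.zeros(len(date))
event[2:5] = 1.
event[15:20] = 1.
df = pd.DataFrame({'date': date, 'event': event})

is_event = df['event'].ne(0)

# Label each run of consecutive equal event values.
run_id = df['event'].diff().ne(0).cumsum()

# Running week counter: increases by one on every Sunday (dayofweek == 6).
# Assumes there are no missing days in the date column.
week_idx = df['date'].dt.dayofweek.eq(6).cumsum()

# Week number within the run: offset from the week in which the run starts.
week = week_idx - week_idx.groupby(run_id).transform('first') + 1

# Day number within each (run, week) pair.
day = df.groupby([run_id, week]).cumcount() + 1

# Keep the values only on event days; everything else is 0.
df['week_of_event'] = week.where(is_event, 0).astype(float)
df['day_within_week_of_event'] = day.where(is_event, 0).astype(float)
```

Since everything here is a cumsum, a transform, or a cumcount, extending it to several time series should mostly be a matter of adding a series identifier (a hypothetical extra column) to each groupby key and computing the run labels per series.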