Python 3.x: computing a rolling sum in a DataFrame based on two variable constraints

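I want to create a variable, SumOfPrevious5OccurencesAtIDLevel, which is the sum of the previous 5 values of Var1 (ordered by the Date variable) within each ID (column 1). When an ID has fewer than 5 previous values, the variable should be NA.

Sample data and expected output: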

ID  Date      Var1  SumOfPrevious5OccurencesAtIDLevel
1   1/1/2018    0   NA
1   1/2/2018    1   NA
1   1/3/2018    2   NA
1   1/4/2018    3   NA
2   1/1/2018    4   NA
2   1/2/2018    5   NA
2   1/3/2018    6   NA
2   1/4/2018    7   NA
2   1/5/2018    8   NA
2   1/6/2018    9   30
2   1/7/2018    10  35
2   1/8/2018    11  40
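
To make the definition concrete, the expected column can be reproduced by hand in plain Python. This is only a sketch of the rule as stated in the question, not the pandas solution; the helper name sum_of_previous and the use of None in place of NA are assumptions made for illustration, and the rows are assumed to be ordered by ID and Date already.

```python
from collections import defaultdict, deque

# (ID, Date, Var1) rows from the sample, already ordered by ID and Date
rows = [
    (1, "1/1/2018", 0), (1, "1/2/2018", 1), (1, "1/3/2018", 2), (1, "1/4/2018", 3),
    (2, "1/1/2018", 4), (2, "1/2/2018", 5), (2, "1/3/2018", 6), (2, "1/4/2018", 7),
    (2, "1/5/2018", 8), (2, "1/6/2018", 9), (2, "1/7/2018", 10), (2, "1/8/2018", 11),
]

def sum_of_previous(rows, n=5):
    """Hypothetical helper: sum of the n preceding Var1 values of the same ID, else None."""
    # one bounded history per ID; the deque drops the oldest value automatically
    history = defaultdict(lambda: deque(maxlen=n))
    result = []
    for id_, _date, value in rows:
        previous = history[id_]
        # a sum is only defined once n earlier values exist for this ID
        result.append(sum(previous) if len(previous) == n else None)
        # the current value becomes visible to later rows only
        previous.append(value)
    return result

expected = sum_of_previous(rows)
print(expected)
```

For ID 2 on 1/6/2018, the five previous values are 4, 5, 6, 7 and 8, which gives 30; ID 1 never accumulates five earlier rows, so every one of its entries stays empty.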
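
Use groupby with transform, together with the rolling and shift functions.

Comment: What if the data is not sorted by ID and Date? Would this work?
df['new'] = df.sort_values(['ID','Date']).groupby('ID')['Var1'].transform(lambda x: x.rolling(5).sum().shift())

Reply: @user3643528 - Good point. In that case the Date column needs to be converted to datetime and the frame sorted first, as in the edited answer:
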
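import pandas as pd

# parse the dates so they sort chronologically rather than as strings
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')
# sort by ID and Date if the data is not already in that order
df = df.sort_values(['ID','Date'])

# per ID: sum over a 5-row window, then shift down one row so that
# each row only sees the 5 values before it
df['new'] = df.groupby('ID')['Var1'].transform(lambda x: x.rolling(5).sum().shift())
print(df)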
    ID       Date  Var1  SumOfPrevious5OccurencesAtIDLevel   new
0    1 2018-01-01     0                                NaN   NaN
1    1 2018-01-02     1                                NaN   NaN
2    1 2018-01-03     2                                NaN   NaN
3    1 2018-01-04     3                                NaN   NaN
4    2 2018-01-01     4                                NaN   NaN
5    2 2018-01-02     5                                NaN   NaN
6    2 2018-01-03     6                                NaN   NaN
7    2 2018-01-04     7                                NaN   NaN
8    2 2018-01-05     8                                NaN   NaN
9    2 2018-01-06     9                               30.0  30.0
10   2 2018-01-07    10                               35.0  35.0
11   2 2018-01-08    11                               40.0  40.0
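
The snippet above assumes df already exists. A self-contained sketch that rebuilds the sample data, shuffles it to simulate unsorted input, and then applies the same steps might look like the following; the shuffling step and the names shuffled and result are assumptions added for illustration only.

```python
import pandas as pd

# sample data from the question
df = pd.DataFrame({
    "ID":   [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2],
    "Date": ["1/1/2018", "1/2/2018", "1/3/2018", "1/4/2018",
             "1/1/2018", "1/2/2018", "1/3/2018", "1/4/2018",
             "1/5/2018", "1/6/2018", "1/7/2018", "1/8/2018"],
    "Var1": list(range(12)),
})
# parse the dates so they sort chronologically rather than as strings
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# shuffle the rows to simulate unsorted input (illustration only)
shuffled = df.sample(frac=1, random_state=0)

# restore ID/Date order before computing anything that depends on row order
result = shuffled.sort_values(["ID", "Date"]).reset_index(drop=True)

# per ID: 5-row rolling sum, shifted one row so the current value is excluded
result["new"] = result.groupby("ID")["Var1"].transform(
    lambda x: x.rolling(5).sum().shift()
)
print(result)
```

Because rolling(5) needs five rows by default before it produces a value, and shift() then moves each sum down by one row, the first five rows of every ID come out as NaN.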