Python: resample data based on groups and compute a rolling sum

I want to create an additional column in my DataFrame without having to loop through these steps.

The column is created in the following steps.

 1. Start from the end of the data. For each date, pick every nth row
 (in this case every 5th), counting back from that date.
 2. Take the rolling sum of the first x of those numbers (x=2).

 A worked example:
 11/22: 5,7,3,2 (every 5th row being picked), but x=2, so 5+7=12
 11/15: 6,5,2 (every 5th row being picked), but x=2, so 6+5=11


                cumulative
 8/30/2019  2   
 9/6/2019   4   
 9/13/2019  1   
 9/20/2019  2   
 9/27/2019  3   5
 10/4/2019  3   7
 10/11/2019 5   6
 10/18/2019 5   7
 10/25/2019 7   10
 11/1/2019  4   7
 11/8/2019  9   14
 11/15/2019 6   11
 11/22/2019 5   12
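
For reference, the cumulative column in this table is consistent with adding each value to the value 4 rows earlier, so "every 5th row" appears to count the starting row itself. A minimal sketch to check this, assuming the column name `value` (the table leaves that column unnamed) and keeping the dates as plain strings:

```python
import pandas as pd

# Values copied from the table; the column name "value" is an assumption.
dates = [
    "8/30/2019", "9/6/2019", "9/13/2019", "9/20/2019", "9/27/2019",
    "10/4/2019", "10/11/2019", "10/18/2019", "10/25/2019", "11/1/2019",
    "11/8/2019", "11/15/2019", "11/22/2019",
]
question_df = pd.DataFrame(
    {"value": [2, 4, 1, 2, 3, 3, 5, 5, 7, 4, 9, 6, 5]}, index=dates
)

step = 4  # distance in rows between the two values that are added

# Current value plus the value `step` rows earlier.
# The first `step` rows have no earlier partner and stay NaN.
question_df["cumulative"] = (
    question_df["value"] + question_df["value"].shift(step)
)
```

The first four rows come out as NaN, which corresponds to the blank cells in the table.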

Suppose we have a set of 15 integers:

import pandas as pd

df = pd.DataFrame([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15], columns=['original_data'])

We define n, the row step at which values are added, and x, the number of values to add:

n = 5
x = 2

shifted = (
    df

    # Add x - 1 columns, each shifted by a further multiple of n rows
    .assign(**{
        'n{} x{}'.format(n, k): df['original_data'].shift(n * k)
        for k in range(1, x)})
)

# Take the row-wise sum (NaN values are skipped by default)
shifted.sum(axis=1)

Output (the intermediate frame, before the row-wise sum):

    original_data   n5 x1
0   1               NaN
1   2               NaN
2   3               NaN
3   4               NaN
4   5               NaN
5   6               1.0
6   7               2.0
7   8               3.0
8   9               4.0
9   10              5.0
10  11              6.0
11  12              7.0
12  13              8.0
13  14              9.0
14  15              10.0
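
To finish the calculation, the shifted column still has to be added to the original one. The following is a minimal sketch, not part of the original answer: `strided_sum` is a hypothetical helper name, and `skipna=False` is an assumption that rows without a full set of `x` values should stay empty rather than show a partial sum.

```python
import pandas as pd


def strided_sum(series, n, x):
    """Sum each value with the x - 1 values found n, 2n, ... rows earlier."""
    # One shifted copy per step back; k = 0 is the series itself.
    shifted = pd.concat([series.shift(n * k) for k in range(x)], axis=1)
    # skipna=False keeps NaN wherever any of the x values is missing.
    return shifted.sum(axis=1, skipna=False)


df = pd.DataFrame({"original_data": range(1, 16)})
df["strided_sum"] = strided_sum(df["original_data"], n=5, x=2)
```

With the 15 integers, the first five rows stay NaN and row 5 becomes 6 + 1 = 7. For the data in the question, calling the helper with `n=4, x=2` matches the expected column, because the worked example adds values that are four rows apart.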