Pandas: loop to multiply by the previous value in a table

I have a DataFrame in pandas that looks like this:

import numpy as np
import pandas as pd

df = pd.DataFrame({'origin_dte':['2009-08-01','2009-08-01','2009-08-01','2009-08-01','2009-09-01','2009-09-01','2009-09-01'],
                   'date':['2009-08-01','2009-08-02','2009-08-03','2009-08-04','2009-09-01','2009-09-02','2009-09-03'],
                   'bal_pred':[10.,11.,12.,13.,21.,22.,23.],
                   'dbal_pred':[np.nan,.25,.3,.5,np.nan,.4,.45]})

    bal_pred   date   dbal_pred origin_dte
0   10      2009-08-01  NaN     2009-08-01
1   11      2009-08-02  0.25    2009-08-01
2   12      2009-08-03  0.30    2009-08-01
3   13      2009-08-04  0.50    2009-08-01
4   21      2009-09-01  NaN     2009-09-01
5   22      2009-09-02  0.40    2009-09-01
6   23      2009-09-03  0.45    2009-09-01
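I want to loop through and replace bal_pred, wherever dbal_pred is not NaN, with dbal_pred[i] * bal_pred[i-1]. For example, the second value of bal_pred becomes 0.25 * 10 = 2.5. Each product uses the already-updated previous value, so the third value becomes 0.30 * 2.5 = 0.75. When origin_dte changes, dbal_pred is NaN again; the calculation skips that NaN row and carries on with the next bal_pred. So df should end up looking like the table below. I have a while loop that does this, but looping over a large DataFrame takes a very long time. A simpler or faster approach would be much appreciated.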

    bal_pred  date       dbal_pred  origin_dte
0   10.000    2009-08-01    NaN     2009-08-01
1   2.500     2009-08-02    0.25    2009-08-01
2   0.750     2009-08-03    0.30    2009-08-01
3   0.375     2009-08-04    0.50    2009-08-01
4   21.000    2009-09-01    NaN     2009-09-01
5   8.400     2009-09-02    0.40    2009-09-01
6   3.780     2009-09-03    0.45    2009-09-01
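
For reference, the recurrence described above can be written as an explicit loop. The asker's own while loop is not shown, so this is only a sketch of what such a loop might look like; the function name `loop_update` is hypothetical. It can serve as a slow but easy-to-check baseline for any faster version.

```python
import numpy as np
import pandas as pd

def loop_update(df):
    """Hypothetical baseline: bal_pred[i] = dbal_pred[i] * bal_pred[i-1]."""
    out = df.copy()
    bal = out['bal_pred'].to_numpy(dtype=float).copy()
    dbal = out['dbal_pred'].to_numpy(dtype=float)
    for i in range(1, len(bal)):
        # A NaN in dbal_pred marks the start of a new run: keep the original balance.
        if not np.isnan(dbal[i]):
            # Uses the already-updated previous balance.
            bal[i] = dbal[i] * bal[i - 1]
    out['bal_pred'] = bal
    return out

df = pd.DataFrame({
    'origin_dte': ['2009-08-01'] * 4 + ['2009-09-01'] * 3,
    'date': ['2009-08-01', '2009-08-02', '2009-08-03', '2009-08-04',
             '2009-09-01', '2009-09-02', '2009-09-03'],
    'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
    'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45],
})

result = loop_update(df)
print(result)
```

The Python-level loop visits every row one at a time, which is why it becomes slow on large frames.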

Another approach is to label each group of rows and then take the cumulative product within each group.

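# Each NaN in dbal_pred starts a new group
group = df['dbal_pred'].isnull().cumsum()
# Use the starting balance as the first factor of each group (dbal_pred itself is left untouched)
factors = df['dbal_pred'].fillna(df['bal_pred'])
# Running product within each group
df['bal_pred'] = factors.groupby(group).cumprod()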
Output:

   bal_pred        date  dbal_pred  origin_dte
0    10.000  2009-08-01        NaN  2009-08-01
1     2.500  2009-08-02       0.25  2009-08-01
2     0.750  2009-08-03       0.30  2009-08-01
3     0.375  2009-08-04       0.50  2009-08-01
4    21.000  2009-09-01        NaN  2009-09-01
5     8.400  2009-09-02       0.40  2009-09-01
6     3.780  2009-09-03       0.45  2009-09-01
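
To see why the grouped cumulative product matches the loop: `isnull().cumsum()` goes up by one at every NaN, so each run that starts at a NaN row gets its own label. Once that NaN is replaced by the starting balance, the running product within a run is start * d1 * d2 * ..., which is the recurrence unrolled. Below is a minimal self-contained sketch, assuming every run begins with a NaN in dbal_pred (as it does in the sample data); the function name `cumprod_update` is hypothetical.

```python
import numpy as np
import pandas as pd

def cumprod_update(df):
    """Hypothetical helper: vectorized bal_pred update via grouped cumprod."""
    out = df.copy()
    # Label runs: the counter increases at each NaN in dbal_pred.
    group = out['dbal_pred'].isnull().cumsum()
    # First factor of each run is the starting balance; the rest are dbal_pred.
    factors = out['dbal_pred'].fillna(out['bal_pred'])
    out['bal_pred'] = factors.groupby(group).cumprod()
    return out

df = pd.DataFrame({
    'origin_dte': ['2009-08-01'] * 4 + ['2009-09-01'] * 3,
    'date': ['2009-08-01', '2009-08-02', '2009-08-03', '2009-08-04',
             '2009-09-01', '2009-09-02', '2009-09-03'],
    'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
    'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45],
})

labels = df['dbal_pred'].isnull().cumsum().tolist()
print(labels)  # [1, 1, 1, 1, 2, 2, 2]

result = cumprod_update(df)
print(result)
```

If the first row of a frame had a non-NaN dbal_pred, those leading rows would fall into group 0 and be multiplied without a starting balance, so the result would differ from the loop; that case would need separate handling.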