Pandas: looping to multiply by the previous value in a table
I have a DataFrame in pandas that looks like this:
import numpy as np
import pandas as pd

df = pd.DataFrame({'origin_dte': ['2009-08-01','2009-08-01','2009-08-01','2009-08-01','2009-09-01','2009-09-01','2009-09-01'],
                   'date': ['2009-08-01','2009-08-02','2009-08-03','2009-08-04','2009-09-01','2009-09-02','2009-09-03'],
                   'bal_pred': [10.,11.,12.,13.,21.,22.,23.],
                   'dbal_pred': [np.nan,.25,.3,.5,np.nan,.4,.45]})
bal_pred date dbal_pred origin_dte
0 10 2009-08-01 NaN 2009-08-01
1 11 2009-08-02 0.25 2009-08-01
2 12 2009-08-03 0.30 2009-08-01
3 13 2009-08-04 0.50 2009-08-01
4 21 2009-09-01 NaN 2009-09-01
5 22 2009-09-02 0.40 2009-09-01
6 23 2009-09-03 0.45 2009-09-01
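I want to loop through the rows and, wherever dbal_pred is not NaN, replace bal_pred with dbal_pred[i] * bal_pred[i-1], where bal_pred[i-1] is the already-updated previous value. For example, the second value of bal_pred becomes 0.25 * 10 = 2.5, and the third becomes 0.30 * 2.5 = 0.75. When origin_dte changes, dbal_pred is NaN again; the calculation skips that NaN row and continues with the next bal_pred. The resulting df should look like the one below. I have a while loop that does this, but looping over a large DataFrame takes a long time. A simpler or faster approach would be much appreciated.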
bal_pred date dbal_pred origin_dte
0 10.000 2009-08-01 NaN 2009-08-01
1 2.500 2009-08-02 0.25 2009-08-01
2 0.750 2009-08-03 0.30 2009-08-01
3 0.375 2009-08-04 0.50 2009-08-01
4 21.000 2009-09-01 NaN 2009-09-01
5 8.400 2009-09-02 0.40 2009-09-01
6 3.780 2009-09-03 0.45 2009-09-01
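The question's while loop is not shown, so the following is only a minimal sketch of what a row-by-row baseline for the rule above might look like. The function name replace_with_loop is hypothetical; it is useful mainly as a reference to check faster approaches against.

```python
import numpy as np
import pandas as pd

def replace_with_loop(df):
    """Hypothetical baseline: apply bal_pred[i] = dbal_pred[i] * bal_pred[i-1] row by row."""
    out = df.copy()
    bal = out['bal_pred'].to_numpy(dtype=float).copy()
    dbal = out['dbal_pred'].to_numpy(dtype=float)
    for i in range(1, len(bal)):
        # Rows where dbal_pred is NaN keep their original bal_pred
        if not np.isnan(dbal[i]):
            # bal[i - 1] has already been updated, so the product chains
            bal[i] = dbal[i] * bal[i - 1]
    out['bal_pred'] = bal
    return out

df = pd.DataFrame({'origin_dte': ['2009-08-01'] * 4 + ['2009-09-01'] * 3,
                   'date': ['2009-08-01', '2009-08-02', '2009-08-03', '2009-08-04',
                            '2009-09-01', '2009-09-02', '2009-09-03'],
                   'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
                   'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45]})

result = replace_with_loop(df)
print(result)
```

Each iteration depends on the previous one, which is why this version is slow on large frames.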
Another approach is to label each group of rows and then take the cumulative product within each group:
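# Each NaN in dbal_pred marks the start of a new group
group = df['dbal_pred'].isnull().cumsum()
# Seed each group with its starting balance; dbal_pred itself is left unchanged
factors = df['dbal_pred'].fillna(df['bal_pred'])
df['bal_pred'] = factors.groupby(group).cumprod()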
Output:
bal_pred date dbal_pred origin_dte
0 10.000 2009-08-01 NaN 2009-08-01
1 2.500 2009-08-02 0.25 2009-08-01
2 0.750 2009-08-03 0.30 2009-08-01
3 0.375 2009-08-04 0.50 2009-08-01
4 21.000 2009-09-01 NaN 2009-09-01
5 8.400 2009-09-02 0.40 2009-09-01
6 3.780 2009-09-03 0.45 2009-09-01
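To see why the cumulative product reproduces the loop, it may help to look at the intermediate values. The sketch below rebuilds the sample data and prints the group label, the per-row factor, and the running product side by side. The names factors and steps are illustrative, and the approach assumes that every group starts with a NaN row in dbal_pred whose bal_pred is the starting balance.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'origin_dte': ['2009-08-01'] * 4 + ['2009-09-01'] * 3,
                   'date': ['2009-08-01', '2009-08-02', '2009-08-03', '2009-08-04',
                            '2009-09-01', '2009-09-02', '2009-09-03'],
                   'bal_pred': [10., 11., 12., 13., 21., 22., 23.],
                   'dbal_pred': [np.nan, .25, .3, .5, np.nan, .4, .45]})

# The running count of NaNs gives every row the id of the group it belongs to
group = df['dbal_pred'].isnull().cumsum()

# In the NaN rows the factor is the starting balance; elsewhere it is the multiplier
factors = df['dbal_pred'].fillna(df['bal_pred'])

# Intermediate values, for inspection only
steps = pd.DataFrame({'group': group,
                      'factor': factors,
                      'cumprod': factors.groupby(group).cumprod()})
print(steps)
```

Within a group the running product is seed * d1 * d2 * ..., which is exactly what chaining dbal_pred[i] * bal_pred[i-1] produces. If the first row of the frame had a non-NaN dbal_pred, those leading rows would have no seed, so that case would need separate handling.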