Fill 0s using the average of the previous 3 months in Python


python pandas


My dataset has the following values:

date           quantity
01/04/2018        35
01/05/2018        33
01/06/2018        75
01/07/2018         0
01/08/2018        70
01/09/2018         0
01/10/2018        66

The code I tried:

df['rollmean3']  = df['quantity'].rolling(3).mean()

Output:

2018-04-01  35.0    NaN
2018-05-01  33.0    NaN
2018-06-01  75.0    47.666667
2018-07-01  0.0     36.000000
2018-08-01  70.0    48.333333
2018-09-01  0.0     23.333333
2018-10-01  66.0    45.333333
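
To reproduce the output above, the sample can be rebuilt as follows. This is a minimal sketch that assumes the dates are in day/month/year order (so `01/04/2018` is 1 April 2018, which matches the printed index) and that `date` is used as the index; neither is stated explicitly in the question.

```python
import pandas as pd

# Sample data from the question; dates are assumed to be day/month/year.
df = pd.DataFrame({
    "date": ["01/04/2018", "01/05/2018", "01/06/2018", "01/07/2018",
             "01/08/2018", "01/09/2018", "01/10/2018"],
    "quantity": [35, 33, 75, 0, 70, 0, 66],
})
df["date"] = pd.to_datetime(df["date"], format="%d/%m/%Y")
df = df.set_index("date")

# A plain rolling mean includes the current row and treats the zeros as
# real values, so it cannot produce the fill described in the question.
df["rollmean3"] = df["quantity"].rolling(3).mean()
print(df)
```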

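Expected output: I need the output to take the average of 35, 33 and 75 and fill that in where the value is 0.0. For the next zero, it should compute the average of the previous three values (including any value that was already filled in) and fill that in.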

2018-04-01  35.0    
2018-05-01  33.0    
2018-06-01  75.0    
2018-07-01  0.0     47.666667
2018-08-01  70.0    
2018-09-01  0.0     64.22222 # average of (75, 47.6667 and 70)
2018-10-01  66.0

The output should be displayed like this.
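
The arithmetic behind the two filled values can be checked by hand. The sketch below only restates numbers from the table; the variable names are illustrative. The second fill uses the first filled value, so each fill depends on the ones before it.

```python
# First zero (2018-07-01): mean of the three preceding quantities.
first_fill = (35 + 33 + 75) / 3

# Second zero (2018-09-01): the three preceding values are 75, the first
# filled value, and 70, so this fill depends on the previous one.
second_fill = (75 + first_fill + 70) / 3

print(round(first_fill, 6), round(second_fill, 6))
```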

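Unfortunately, there does not seem to be a vectorized solution to this problem in pandas. You need to iterate over the rows and fill in the missing values one at a time. This will be slow; if you need to speed it up, you can JIT-compile the code.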
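
One way to write that row-by-row loop is over a plain NumPy array, as in the sketch below. The function name `fill_zeros_with_trailing_mean` is hypothetical, and returning NaN for a zero that has no earlier values is an assumption the question does not cover. The loop touches only array elements and scalars, which is the style JIT compilers such as Numba are generally designed for, but no JIT is applied or tested here.

```python
import numpy as np

def fill_zeros_with_trailing_mean(values, window=3):
    """Replace each 0 with the mean of up to `window` preceding values.

    Earlier fills are reused by later ones, which is what makes the
    computation sequential rather than vectorizable.
    """
    out = np.asarray(values, dtype=np.float64).copy()
    for i in range(len(out)):
        if out[i] == 0.0:
            start = max(i - window, 0)
            if i > start:
                out[i] = out[start:i].mean()
            else:
                out[i] = np.nan  # assumption: nothing earlier to average
    return out

filled = fill_zeros_with_trailing_mean([35, 33, 75, 0, 70, 0, 66])
print(filled)
```

With a DataFrame, the result could be assigned back with `df['quantity'] = fill_zeros_with_trailing_mean(df['quantity'].to_numpy())`.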

As John Zwinck said, there is no vectorized solution in pandas.

You have to iterate with something like `.iterrows()`, for example:

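df['quantity'] = df['quantity'].astype(float)  # keep the filled means from being cast to int
# enumerate() supplies the integer position, so this also works with a date index
for pos, (label, row) in enumerate(df.iterrows()):
    if row['quantity'] == 0:
        df.loc[label, 'quantity'] = df['quantity'].iloc[max(pos - 3, 0):pos].mean()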

If you like, you can even use recursion:

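import pandas as pd

def fill_recursively(column: pd.Series, window_size: int = 3) -> pd.Series:
    column = column.astype(float)  # work on a float copy of the input
    if 0 in column.values:
        idx = column.tolist().index(0)  # position of the first remaining zero
        # slice by position; max() stops a negative start from wrapping around
        column.iloc[idx] = column.iloc[max(idx - window_size, 0):idx].mean()
        column = fill_recursively(column, window_size)
    return column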

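You can verify that `fill_recursively(df['quantity'])` returns the desired result (just make sure the column has dtype float; otherwise the filled values may be rounded to integers).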