Python: fill cells containing NaN with the average of the values before and after


python pandas scikit-learn


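I want to fill the missing values in a pandas DataFrame with the average of the cells directly before and after each missing value. For example, in [1, NaN, 3] the NaN would become 2, because (1 + 3) / 2 = 2. I could not find a way to do this with pandas or scikit-learn. Is there a way to do it?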

This works as long as no NaN value sits at the first or last index, which your imputation method implies.

>>> import numpy as np
>>> import pandas as pd
>>> data = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
...                      'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
...                      'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})
>>> data
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   NaN  1.0
3  -2  12.0  1.0
4   4   8.0  NaN
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0



>>> nan_cols = data.columns[data.isnull().any(axis=0)]
>>> for col in nan_cols:
...     # Skip the first and last rows: they have no neighbour on one side.
...     for i in range(1, len(data) - 1):
...         if pd.isnull(data.loc[i, col]):
...             data.loc[i, col] = (data.loc[i-1, col] + data.loc[i+1, col]) / 2
...


>>> data
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   4.5  1.0
3  -2  12.0  1.0
4   4   8.0  1.5
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0
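The loop above relies on every NaN having a non-null value directly above and below it. A minimal sketch of how that precondition could be checked first; the names `edge_nan` and `consecutive_nan` are only illustrative:

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
                     'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
                     'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})

is_nan = data.isna()

# Per column: is the first or last row NaN (no neighbour on one side)?
edge_nan = is_nan.iloc[[0, -1]].any(axis=0)

# Per column: is any NaN directly preceded by another NaN?
consecutive_nan = (is_nan & is_nan.shift(1, fill_value=False)).any(axis=0)

print(edge_nan.to_dict())
print(consecutive_nan.to_dict())
```

For this example both checks come back False for every column, so the neighbour average is defined for each missing cell.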

Consider this DataFrame

df = pd.DataFrame({'val': [1,np.nan, 4, 5, np.nan, 10]})

    val
0   1.0
1   NaN
2   4.0
3   5.0
4   NaN
5  10.0

You can use fillna() and shift() to get the desired output

df.val = df.val.fillna((df.val.shift() + df.val.shift(-1))/2)

You get

    val
0   1.0
1   2.5
2   4.0
3   5.0
4   7.5
5  10.0
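The shift-based fill only has a value to offer where both immediate neighbours are present. A small sketch on a made-up series (not from the question) shows which gaps it leaves untouched:

```python
import numpy as np
import pandas as pd

# Hypothetical series: a leading NaN, a single gap, and a run of two NaNs.
s = pd.Series([np.nan, 1.0, np.nan, 3.0, np.nan, np.nan, 9.0])

# Average of the immediate neighbours; NaN wherever either neighbour is missing.
neighbour_mean = (s.shift() + s.shift(-1)) / 2
filled = s.fillna(neighbour_mean)

print(filled.tolist())
```

Only the isolated gap at position 2 is filled (with 2.0). The leading NaN and the two consecutive NaNs stay missing, because at least one of their neighbours is itself NaN.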

Using spies006's example df

df = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3], 
'b': [6, -3, np.nan, 12, 8, 11, -5, -5], 
'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})

# Use DataFrame.where to keep non-null cells and fill each NaN with the average of the nearest non-null values above and below it.
df.where(df.notnull(), other=(df.ffill() + df.bfill()) / 2)
Out[2517]: 
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   4.5  1.0
3  -2  12.0  1.0
4   4   8.0  1.5
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0
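As a rough sketch, the forward-fill/back-fill idea can be wrapped in a helper and compared with pandas' built-in linear interpolation. The function name `fill_with_neighbour_mean` and the sample frame are assumptions for illustration only:

```python
import numpy as np
import pandas as pd

def fill_with_neighbour_mean(frame):
    """Replace each NaN with the mean of the nearest non-null values above and below it."""
    return frame.where(frame.notna(), (frame.ffill() + frame.bfill()) / 2)

# Hypothetical frame with two consecutive NaNs and a trailing NaN.
df = pd.DataFrame({'val': [1.0, np.nan, np.nan, 4.0, np.nan]})

by_mean = fill_with_neighbour_mean(df)
# limit_area='inside' restricts interpolation to NaNs surrounded by valid values.
by_interp = df.interpolate(method='linear', limit_area='inside')

print(by_mean['val'].tolist())
print(by_interp['val'].tolist())
```

For a single isolated NaN the two approaches agree. For a run of NaNs the helper gives every cell in the run the same value (2.5 here), while linear interpolation spaces them evenly (2.0 and 3.0). In both cases the trailing NaN stays missing, since there is no value after it.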

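@ This solution does not work if there are several null values in a row. You closed a question that addresses that specific case as a duplicate of this one, which it is not. Please reopen it.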