Python: error when filling empty rows with the mean using SimpleImputer

Tags: python, python-3.x, scikit-learn

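I am trying to fill the missing values in two columns of a DataFrame with each column's mean.

Here are the values in the two columns: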

0   -3.0    NaN
1   25.0    NaN
2   25.0    NaN
3   1937.0  NaN
4   1965.0  NaN
5   1993.0  NaN
6   2021.0  NaN
7   2021.0  NaN
8   2049.0  NaN
9   2077.0  NaN
10  2105.0  NaN
11  2133.0  NaN
12  2161.0  NaN
13  2189.0  NaN
14  2217.0  NaN

import numpy as np
from sklearn.impute import SimpleImputer

imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

# Fit on column1 and replace its NaN values with the column mean
imputer = imputer.fit(data[['column1']])
data['column1'] = imputer.transform(data[['column1']]).ravel()

# Same steps for column2; this assignment raises the ValueError
imputer = imputer.fit(data[['column2']])
data['column2'] = imputer.transform(data[['column2']]).ravel()
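
The fit and transform for the first column work fine, but the same steps for the second column raise an error.

The error is "ValueError: Length of values does not match length of index". Here is the full stack trace: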

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-30-44682728650c> in <module>()
      3 
      4 imputer = imputer.fit(data[['column2']])
----> 5 data['column2'] = imputer.transform(data[['column2']]).ravel()

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   3368         else:
   3369             # set column
-> 3370             self._set_item(key, value)
   3371 
   3372     def _setitem_slice(self, key, value):

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   3443 
   3444         self._ensure_valid_index(value)
-> 3445         value = self._sanitize_column(key, value)
   3446         NDFrame._set_item(self, key, value)
   3447 

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/frame.py in _sanitize_column(self, key, value, broadcast)
   3628 
   3629             # turn me into an ndarray
-> 3630             value = sanitize_index(value, self.index, copy=False)
   3631             if not isinstance(value, (np.ndarray, Index)):
   3632                 if isinstance(value, list) and len(value) > 0:

~/anaconda3/envs/python3/lib/python3.6/site-packages/pandas/core/internals/construction.py in sanitize_index(data, index, copy)
    517 
    518     if len(data) != len(index):
--> 519         raise ValueError('Length of values does not match length of index')
    520 
    521     if isinstance(data, ABCIndexClass) and not copy:

ValueError: Length of values does not match length of index

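Can you share part of your DataFrame? The problem comes from the data, not from the imputer.

@NicolasM Here are the values in the columns.

Is there at least one non-NaN value in column2? The 'mean' strategy needs at least one numeric value.

@NicolasM That was it. Sorry, I didn't check that at first.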
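
To illustrate the diagnosis from the comments, here is a minimal sketch of what appears to happen. The small DataFrame is made-up sample data, not the asker's, and the `usable` list is only one possible guard. With `strategy='mean'` and default settings, SimpleImputer drops a column that has no observed values, so the transformed array has zero columns. Calling `.ravel()` on it then yields an empty array, whose length cannot match the index.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical sample data: column2 contains only NaN, as in the question
data = pd.DataFrame({
    "column1": [-3.0, 25.0, np.nan, 1937.0],
    "column2": [np.nan] * 4,
})

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# An all-NaN column has no mean, so the imputer drops it from the output
out = imputer.fit_transform(data[["column2"]])
empty_shape = out.shape  # zero columns remain, so out.ravel() is empty

# One possible guard: impute only columns with at least one observed value
usable = [c for c in data.columns if data[c].notna().any()]
data[usable] = imputer.fit_transform(data[usable])
```

After this runs, column1 has its gap filled with the mean of its observed values, and column2 is left untouched as all NaN. How to handle column2 from there (drop it, or fill it with a constant) is a separate decision. Newer scikit-learn releases also provide a `keep_empty_features` option on SimpleImputer; check the documentation for your installed version before relying on it.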