scikit-learn SimpleImputer cannot handle my data

All,

My data contains np.nan and np.inf values. I want to replace them with 0, but when I run the following I get the error shown below:

imputer = SimpleImputer(missing_values=np.nan, strategy='constant', fill_value=0)
features_to_impute = data_fe.columns.tolist()

data_fe[features_to_impute] = pd.DataFrame(imputer.fit_transform(data_fe[features_to_impute]), 
                                           columns=features_to_impute)


ValueError: Input contains infinity or a value too large for dtype('float64').

I'm not sure how to deal with this. Does anyone know how I can handle it and impute the inf values at the same time?

If you want to replace np.nan and np.inf with 0, just use np.nan_to_num:

np.nan_to_num(data_fe[features_to_impute], nan=0, posinf=0, neginf=0)
For example:

a = np.array([[1, 2, np.nan, 5],
              [-np.inf, 9,3,np.nan],
              [8, np.inf, np.nan,9]])

Out[441]:
array([[  1.,   2.,  nan,   5.],
       [-inf,   9.,   3.,  nan],
       [  8.,  inf,  nan,   9.]])

b = np.nan_to_num(a, nan=0, posinf=0, neginf=0)

Out[444]:
array([[1., 2., 0., 5.],
       [0., 9., 3., 0.],
       [8., 0., 0., 9.]])
So in your case, just pass the selected columns of the DataFrame to np.nan_to_num:

np.nan_to_num(data_fe[features_to_impute], nan=0, posinf=0, neginf=0)
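Note that np.nan_to_num returns a new NumPy array rather than changing the DataFrame, so the result has to be assigned back. The sketch below shows one way to do that, plus an alternative that keeps SimpleImputer by first turning the infinities into NaN. The small data_fe frame and the names cleaned, no_inf and imputed are made up for illustration and are not from the question. The nan, posinf and neginf keywords need NumPy 1.17 or later.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for the asker's data_fe (assumed all-numeric).
data_fe = pd.DataFrame({
    "x": [1.0, -np.inf, 8.0],
    "y": [2.0, 9.0, np.inf],
    "z": [np.nan, 3.0, np.nan],
})
features_to_impute = data_fe.columns.tolist()

# Option 1: np.nan_to_num returns an ndarray, so assign it back to the columns.
cleaned = data_fe.copy()
cleaned[features_to_impute] = np.nan_to_num(
    data_fe[features_to_impute], nan=0, posinf=0, neginf=0
)

# Option 2: convert +/-inf to NaN first, then SimpleImputer no longer
# raises and fills everything with the constant.
no_inf = data_fe[features_to_impute].replace([np.inf, -np.inf], np.nan)
imputer = SimpleImputer(missing_values=np.nan, strategy="constant", fill_value=0)
imputed = pd.DataFrame(
    imputer.fit_transform(no_inf),
    columns=features_to_impute,
    index=data_fe.index,  # keep the original row labels
)

print(cleaned)
print(imputed)
```

Option 2 is probably the better fit if the imputer is part of a scikit-learn pipeline, or if you later want a strategy other than a constant. Option 1 is shorter when all you need is zeros.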