Scikit-learn SimpleImputer cannot handle my data
All, my data contains np.nan and np.inf values. I want to replace them with 0, but when I run the following, I get the error shown after the code:
imputer = SimpleImputer(missing_values=np.nan, strategy='constant', fill_value=0)
features_to_impute = data_fe.columns.tolist()
data_fe[features_to_impute] = pd.DataFrame(imputer.fit_transform(data_fe[features_to_impute]),
                                           columns=features_to_impute)
ValueError: Input contains infinity or a value too large for dtype('float64').
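I'm not sure how to deal with this. Does anyone know how I can handle it and impute the inf values at the same time?

If you want to replace np.nan and np.inf with 0, just use np.nan_to_num: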
np.nan_to_num(data_fe[features_to_impute], nan=0, posinf=0, neginf=0)
For example:
a = np.array([[1, 2, np.nan, 5],
[-np.inf, 9,3,np.nan],
[8, np.inf, np.nan,9]])
Out[441]:
array([[ 1., 2., nan, 5.],
[-inf, 9., 3., nan],
[ 8., inf, nan, 9.]])
b = np.nan_to_num(a, nan=0, posinf=0, neginf=0)
Out[444]:
array([[1., 2., 0., 5.],
[0., 9., 3., 0.],
[8., 0., 0., 9.]])
So in your case, just pass the selected columns of the DataFrame to np.nan_to_num:
np.nan_to_num(data_fe[features_to_impute], nan=0, posinf=0, neginf=0)
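Note that np.nan_to_num returns a new array rather than modifying the DataFrame, so the result has to be assigned back. Here is a minimal sketch of that, along with a pandas-only alternative that first turns the infinities into NaN. The small data_fe frame is a made-up stand-in for the real data, and the nan/posinf/neginf keywords assume NumPy 1.17 or later.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data_fe (same values as the array example).
data_fe = pd.DataFrame({
    "a": [1.0, -np.inf, 8.0],
    "b": [2.0, 9.0, np.inf],
    "c": [np.nan, 3.0, np.nan],
})
features_to_impute = data_fe.columns.tolist()

# Option 1: np.nan_to_num returns a new ndarray, so assign it back to the columns.
cleaned = data_fe.copy()
cleaned[features_to_impute] = np.nan_to_num(
    cleaned[features_to_impute].to_numpy(dtype=float),
    nan=0.0, posinf=0.0, neginf=0.0,
)

# Option 2: convert +/-inf to NaN first, then fill every NaN with 0.
no_inf = data_fe[features_to_impute].replace([np.inf, -np.inf], np.nan)
filled = no_inf.fillna(0)

print(cleaned)
print(filled)
```

The intermediate no_inf frame in the second option contains only NaN as a missing marker, so it could also be passed to the original SimpleImputer(missing_values=np.nan, strategy='constant', fill_value=0) call, which rejects infinite values but accepts NaN.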