Python 3.x sklearn.impute.SimpleImputer: 'NaN' for missing values not working

I have a dataset, Data.csv:

Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,,Yes
France,35,58000,Yes
Spain,,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,67000,Yes
I tried to fill the NaN values with sklearn.impute.SimpleImputer using the following code:

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Importing the dataset
dataset = pd.read_csv('Data.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

# Taking care of missing data
from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values = 'NaN', strategy = 'mean')
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
But I get an error that says:

File "C:\Users\Krishna Rohith\Machine Learning A-Z\Part 1 - Data Preprocessing\Section 2 ----------- --------- Part 1 - Data Preprocessing --------------------\missing_data.py", line 16, in <module>
imputer = imputer.fit(X[:, 1:3])

File "C:\Users\Krishna Rohith\Anaconda3\lib\site-packages\sklearn\impute\_base.py", line 268, in fit
X = self._validate_input(X)

File "C:\Users\Krishna Rohith\Anaconda3\lib\site-packages\sklearn\impute\_base.py", line 242, in _validate_input
raise ve

File "C:\Users\Krishna Rohith\Anaconda3\lib\site-packages\sklearn\impute\_base.py", line 235, in _validate_input
force_all_finite=force_all_finite, copy=self.copy)

File "C:\Users\Krishna Rohith\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 562, in check_array
allow_nan=force_all_finite == 'allow-nan')

File "C:\Users\Krishna Rohith\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 60, in _assert_all_finite
msg_dtype if msg_dtype is not None else X.dtype)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
I know how to do this in other ways, but can someone tell me how to do it with sklearn.impute?

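imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

Replace the string 'NaN' with NumPy's NaN value, np.nan. pandas reads the empty CSV cells as np.nan, which the string 'NaN' never matches, so SimpleImputer treats them as invalid input and raises the ValueError above.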
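
For reference, here is a minimal end-to-end sketch of that fix applied to the data from the question. It assumes the CSV contents shown in the question; the rows are inlined through io.StringIO (my substitution for reading Data.csv from disk) so the snippet runs on its own, and fit_transform is used in place of the separate fit and transform calls.

```python
import io

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Same rows as Data.csv in the question, inlined so no file is needed.
CSV_TEXT = """Country,Age,Salary,Purchased
France,44,72000,No
Spain,27,48000,Yes
Germany,30,54000,No
Spain,38,61000,No
Germany,40,,Yes
France,35,58000,Yes
Spain,,52000,No
France,48,79000,Yes
Germany,50,83000,No
France,37,67000,Yes
"""

dataset = pd.read_csv(io.StringIO(CSV_TEXT))
X = dataset.iloc[:, :-1].values   # Country, Age, Salary
y = dataset.iloc[:, -1].values    # Purchased

# np.nan (not the string 'NaN') is what pandas stores for empty cells.
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')

# Fit on the two numeric columns and write the imputed values back.
X[:, 1:3] = imputer.fit_transform(X[:, 1:3])

# The missing Age (row 6) becomes the column mean 349/9, about 38.78;
# the missing Salary (row 4) becomes 574000/9, about 63777.78.
print(X)
```

The means are computed over the nine observed values in each column, so the two blank cells are filled without changing any other entry.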

