Python: I want to fill the missing values in the z matrix with the mean of all values in z
I want to fill in the missing data in each column with that column's mean, using the following code:
#Data Preprocessing
#Importing libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#Importing dataset
dataset = pd.read_csv('Book1.csv')
x = dataset.iloc[:, :-2].values
y = dataset.iloc[:, -2].values
z = dataset.iloc[:, 4].values
#Dealing with missing data
from sklearn.preprocessing import Imputer
imputer = Imputer()
imputer = imputer.fit(x[:,1:3])
imputer = imputer.fit(z[:])
x[:, 1:3] = imputer.transform(x[:, 1:3])
z[:] = imputer.transform(z[:])
When I try to run this, I get an error:
Traceback (most recent call last):
File "<ipython-input-24-f33b6b1880df>", line 15, in <module>
imputer = imputer.fit(z[:])
File "C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\imputation.py", line 155, in fit
force_all_finite=False)
File "C:\ProgramData\Anaconda3\lib\site-packages\sklearn\utils\validation.py", line 441, in check_array
"if it contains a single sample.".format(array))
ValueError: Expected 2D array, got 1D array instead:
array=[ 1. 3. 4. nan 5. 7. 6. 9. 8. 10.].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample
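What should I change in my code to fill in the missing data in the "test" column? I tried including the "test" column in x.

Apparently, you are using a single Imputer instance to impute both the x (2D) and z (1D) arrays. You should create separate imputers for the two variables: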
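imputer_x = Imputer()
imputer_z = Imputer()
imputer_x.fit(x[:, 1:3])
# Imputer expects a 2D array, so reshape the 1D column z to shape (n_samples, 1)
imputer_z.fit(z.reshape(-1, 1))
x[:, 1:3] = imputer_x.transform(x[:, 1:3])
# Flatten the imputed column back to 1D before writing it into z
z[:] = imputer_z.transform(z.reshape(-1, 1)).ravel()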
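Note that sklearn.preprocessing.Imputer was deprecated in scikit-learn 0.20 and removed in 0.22; on newer versions the replacement is sklearn.impute.SimpleImputer. The following is a minimal sketch of the same mean imputation for a single 1D column, assuming a recent scikit-learn. The array here is a hypothetical stand-in for the "test" column, built from the values shown in the traceback rather than read from Book1.csv.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical stand-in for the "test" column (values copied from the traceback).
z = np.array([1., 3., 4., np.nan, 5., 7., 6., 9., 8., 10.])

# The imputer expects shape (n_samples, n_features), so the 1D column
# is reshaped to (10, 1) before fitting, as the error message suggests.
imputer_z = SimpleImputer(missing_values=np.nan, strategy="mean")
z_filled = imputer_z.fit_transform(z.reshape(-1, 1)).ravel()

# The NaN at index 3 is replaced by the mean of the nine observed values.
print(z_filled)
```

The reshape(-1, 1) call is what resolves the "Expected 2D array, got 1D array" error; ravel() only converts the result back to a 1D array so that it can be used like the original column.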