Unexpected \n and \t in the output when using OneHotEncoder from sklearn.preprocessing in Python
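I am using OneHotEncoder to convert string columns to numbers. However, the output matrix contains some unexpected \n and \t characters. The last 5 lines of my code cause the problem.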
import pandas as pd
import numpy as np
car_sales_missing = pd.read_csv("Data/car-sales-extended-missing-data.csv")
X = car_sales_missing.drop("Price",axis=1)
y = car_sales_missing["Price"]
from sklearn.model_selection import train_test_split as splt
X_train, X_test, y_train, y_test = splt(X,y,test_size=0.2)
from sklearn.impute import SimpleImputer
from sklearn.compose import ColumnTransformer
cat_imputer = SimpleImputer(strategy="constant",fill_value="missing")
door_imputer = SimpleImputer(strategy="constant",fill_value=4)
num_imputer = SimpleImputer(strategy="mean")
cat_features = ["Make","Colour"]
door_features = ["Doors"]
num_features = ["Odometer (KM)"]
imputer = ColumnTransformer([
    ("cat_imputer", cat_imputer, cat_features),
    ("num_imputer", num_imputer, num_features),
    ("door_imputer", door_imputer, door_features),
])
filled_X = imputer.fit_transform(X)
car_sales_filled = pd.DataFrame(filled_X,columns=["Make","Colour","Odometer (KM)","Doors"])
X = car_sales_filled
from sklearn.preprocessing import OneHotEncoder
categorical_features = ["Make","Colour","Doors"]
one_hot = OneHotEncoder()
transformer = ColumnTransformer([("one_hot",one_hot,categorical_features)],remainder="passthrough")
transformed_X = transformer.fit_transform(X)
pd.DataFrame(transformed_X)
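
A likely explanation, offered as a sketch rather than a confirmed diagnosis: OneHotEncoder produces a SciPy sparse matrix by default, and ColumnTransformer keeps the combined result sparse when it is mostly zeros. Passing a sparse matrix straight to pd.DataFrame can, depending on the pandas version, put each sparse row's text representation into a single cell, and that representation contains \t and \n. The example below uses a small made-up table (the `toy` data is hypothetical, not the CSV from the question) and converts the result to a dense array before building the DataFrame.

```python
import pandas as pd
from scipy import sparse
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the car sales CSV from the question.
toy = pd.DataFrame({
    "Make": ["Honda", "BMW", "Toyota", "Nissan", "Ford", "Audi"],
    "Colour": ["White", "Blue", "Red", "Green", "Black", "White"],
    "Doors": [4, 5, 3, 4, 4, 5],
    "Odometer (KM)": [35431.0, 192714.0, 84714.0, 154365.0, 181577.0, 42652.0],
})

transformer = ColumnTransformer(
    [("one_hot", OneHotEncoder(), ["Make", "Colour", "Doors"])],
    remainder="passthrough",
)
transformed = transformer.fit_transform(toy)

# The result may be a SciPy sparse matrix; check before converting.
is_sparse = sparse.issparse(transformed)
dense = transformed.toarray() if is_sparse else transformed

# A dense 2-D array gives one DataFrame column per encoded feature.
dense_df = pd.DataFrame(dense)
print(is_sparse, dense_df.shape)
```

Applied to the code in the question, this would mean replacing the last line with pd.DataFrame(transformed_X.toarray()). An alternative that should have the same effect is to pass sparse_threshold=0 to ColumnTransformer so that it always returns a dense array.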