Scikit-learn: error when running XGBoost with categorical features in a pipeline
I am running XGBoost through a pipeline. I have many categorical features, and I one-hot encode them inside the pipeline, but I still end up with the error "ValueError: DataFrame.dtypes for data must be int, float or bool". If the one-hot encoder has already converted the categorical features to numbers, why does this error occur?

Tags: scikit-learn, pipeline, xgboost, categorical-data
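import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from xgboost import XGBClassifier

# selecting numerical features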
numeric_features = X_train.select_dtypes(include=np.number).columns
# selecting categorical features
categorical_features = X_train.select_dtypes(exclude=np.number).columns
# imputing and scaling pipeline for numerical features
numeric_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')),
                                      ('scaler', StandardScaler())])
# imputing and encoding pipeline for categorical features
categorical_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='constant', fill_value='Missing')),
                                          ('onehot', OneHotEncoder(handle_unknown='ignore'))])
# combine the preprocessing steps into a single transformer
preprocessor = ColumnTransformer(transformers=[('num', numeric_transformer, numeric_features),
                                               ('cat', categorical_transformer, categorical_features)])
# setting up the pipeline
pipe = Pipeline(steps=[('preprocessor', preprocessor),
                       ('xgb', XGBClassifier(random_state=10))])
param_grid = {
    "xgb__n_estimators": [100, 500, 700],
    "xgb__learning_rate": [0.001, 0.1, 0.5, 1],
    "xgb__max_depth": [4, 5],
    "xgb__alpha": [0, 0.25, 0.5, 0.75, 1],
    "xgb__lambda": [0, 0.2, 0.4, 0.6, 0.8, 1]
}
# fit parameters forwarded to the 'xgb' step (name must match the call to fit below)
fit_params = {"xgb__eval_set": [(X_test, y_test)],
              "xgb__early_stopping_rounds": 10,
              "xgb__verbose": False}
xgbmodel = GridSearchCV(pipe, cv=5, param_grid=param_grid, scoring='accuracy')
xgbmodel.fit(X_train, y_train, **fit_params)
print(xgbmodel.best_params_)
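
One likely cause, assuming the error is raised while XGBoost validates the evaluation data: fit parameters such as xgb__eval_set are handed to the final step as they are, so the pipeline's preprocessor never touches X_test, and XGBoost receives a DataFrame that still contains the raw categorical columns. The sketch below uses hypothetical toy data (the column names age and city and all values are made up for illustration) to show the difference between the raw frame and an evaluation set encoded by a fitted preprocessor. It only uses scikit-learn and pandas; the final dictionary shows how the encoded set could be passed instead.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical toy data standing in for the question's X_train / X_test / y_test.
X_train = pd.DataFrame({"age": [25, 32, 47, 51],
                        "city": ["Paris", "Rome", "Paris", "Oslo"]})
X_test = pd.DataFrame({"age": [38, 29], "city": ["Rome", "Berlin"]})
y_test = np.array([0, 1])

numeric_features = X_train.select_dtypes(include=np.number).columns
categorical_features = X_train.select_dtypes(exclude=np.number).columns

# Same preprocessing layout as in the question.
numeric_transformer = Pipeline(steps=[("imputer", SimpleImputer(strategy="median")),
                                      ("scaler", StandardScaler())])
categorical_transformer = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value="Missing")),
    ("onehot", OneHotEncoder(handle_unknown="ignore"))])
preprocessor = ColumnTransformer(transformers=[
    ("num", numeric_transformer, numeric_features),
    ("cat", categorical_transformer, categorical_features)])

# Fit the preprocessing on the training data only, then encode the evaluation set.
preprocessor.fit(X_train)
X_test_encoded = preprocessor.transform(X_test)

print(X_test.dtypes)         # the raw frame still has a non-numeric 'city' column
print(X_test_encoded.dtype)  # the encoded matrix has a single numeric dtype

# Pass the encoded evaluation set to the final step instead of the raw frame.
fit_params = {"xgb__eval_set": [(X_test_encoded, y_test)],
              "xgb__verbose": False}
```

With this change, xgbmodel.fit(X_train, y_train, **fit_params) would give XGBoost an all-numeric evaluation set. One caveat: inside GridSearchCV the preprocessor is refit on every fold, so if a fold is missing a category, the number of one-hot columns could differ from the pre-encoded evaluation set. Passing an explicit categories list to OneHotEncoder is one way to keep the column layout fixed.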