Can I use GridSearchCV with KNeighborsRegressor?


python scikit-learn


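I have a dataset with a number of float feature columns (X_train) and a continuous target (y_train).

I want to run KNN regression on the dataset, and I want to (1) do a grid search for hyperparameter tuning and (2) use cross-validation during training.

I wrote this code: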

from sklearn.model_selection import KFold
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.neighbors import KNeighborsRegressor

# scaled_df holds the scaled float features, target the continuous labels
X_train, X_test, y_train, y_test = train_test_split(scaled_df, target, test_size=0.2)

cv_method = RepeatedStratifiedKFold(n_splits=5, 
                                    n_repeats=3, 
                                    random_state=999)


# Define our candidate hyperparameters
hp_candidates = [{'n_neighbors': [2,3,4,5,6,7,8,9,10,11,12,13,14,15], 'weights': ['uniform','distance'],'p':[1,2,5]}]

# Search for best hyperparameters
grid = GridSearchCV(estimator=KNeighborsRegressor(), 
                      param_grid=hp_candidates, 
                      cv=cv_method,
                      verbose=1,  
                      scoring='accuracy', 
                      return_train_score=True)

grid.fit(X_train,y_train)

The error I get is:

Supported target types are: ('binary', 'multiclass'). Got 'continuous' instead.

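My reading of this error is that I can only use KNN for classification, not for regression.

What I cannot find is how to edit this code so that it works for KNN regression. Can someone show me how to do this?

(The end goal: I have a dataset, I want to tune the parameters, run cross-validation, output the best model based on that, and get some accuracy scores. Ideally these would be scores that are comparable across other algorithms rather than specific to KNN, so that I can compare models.)

Also, this is my first attempt at using KNN in scikit-learn, so all comments and critiques are welcome.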

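Yes, you can use GridSearchCV with KNeighborsRegressor.

What you have is a metric selection problem; the scikit-learn documentation on scoring metrics covers this.

The metrics that apply to regression problems are different from those for classification. These are the scoring names available for regression: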

'explained_variance'
'max_error'
'neg_mean_absolute_error'
'neg_mean_squared_error'
'neg_root_mean_squared_error'
'neg_mean_squared_log_error'
'neg_median_absolute_error'
'r2'
'neg_mean_poisson_deviance'
'neg_mean_gamma_deviance'
'neg_mean_absolute_percentage_error'

So you can pick one of these to replace 'accuracy' and test again.
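As a rough sketch of what the edited code might look like: besides swapping the scorer, the splitter probably needs to change as well, because the quoted message ("Supported target types are: ('binary', 'multiclass')") is what the stratified splitters raise when given a continuous target. The example below assumes RepeatedKFold as the non-stratified replacement and 'neg_root_mean_squared_error' as the metric; the synthetic data from make_regression is only a stand-in for scaled_df and target, which are not shown in the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, RepeatedKFold, train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Assumption: synthetic data standing in for the asker's scaled_df / target.
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# RepeatedKFold does not stratify on labels, so a continuous target is fine.
cv_method = RepeatedKFold(n_splits=5, n_repeats=3, random_state=999)

# Same candidate grid as in the question.
hp_candidates = {
    "n_neighbors": list(range(2, 16)),
    "weights": ["uniform", "distance"],
    "p": [1, 2, 5],
}

grid = GridSearchCV(
    estimator=KNeighborsRegressor(),
    param_grid=hp_candidates,
    cv=cv_method,
    scoring="neg_root_mean_squared_error",  # a regression scorer
    return_train_score=True,
)
grid.fit(X_train, y_train)

# "neg_" scorers return the negated error so that higher is always better;
# flip the sign to read the value as an ordinary RMSE.
best_cv_rmse = -grid.best_score_

# Evaluate the refitted best model on the held-out test set.
y_pred = grid.predict(X_test)
test_rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
test_r2 = r2_score(y_test, y_pred)

print("best params:", grid.best_params_)
print("cross-validated RMSE:", best_cv_rmse)
print("test RMSE:", test_rmse, "test R^2:", test_r2)
```

RMSE and R^2 are not specific to KNN, so the same cv_method and scoring string can be passed to a GridSearchCV around a different regressor to get scores that are directly comparable.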

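Can you share part of your data, for example five arbitrary samples? Which line causes this error? Also, you are using accuracy as the metric for a regression task, which is not appropriate.

Is your problem classification or regression?

It is regression (the y series/label is continuous). Mustafa, I can post some rows, but each row has 150+ columns, so I am not sure it would fit (?). Each row has about 150 float values (the features) and one y label that is also a float.

Then, as @MustafaAydın said, you cannot use accuracy as the metric.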
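To illustrate the point made in these comments, here is a small sketch of how scikit-learn classifies a target array and why accuracy is rejected for continuous values. The arrays are made-up toy values, not the asker's data; type_of_target is the scikit-learn utility that produces labels such as 'continuous' and 'binary'.

```python
import numpy as np
from sklearn.metrics import accuracy_score, mean_absolute_error
from sklearn.utils.multiclass import type_of_target

# Toy targets (hypothetical values, for illustration only).
y_continuous = np.array([0.5, 1.7, 2.3, 3.9])
y_classes = np.array([0, 1, 1, 0])

# scikit-learn infers the kind of target from the values themselves.
kind_continuous = type_of_target(y_continuous)
kind_classes = type_of_target(y_classes)
print(kind_continuous, kind_classes)

# accuracy_score counts exact label matches, so it refuses continuous targets.
try:
    accuracy_score(y_continuous, y_continuous + 0.1)
    accuracy_failed = False
except ValueError as exc:
    accuracy_failed = True
    print("accuracy_score rejected the target:", exc)

# A regression metric measures how far off the predictions are instead.
mae = mean_absolute_error(y_continuous, y_continuous + 0.1)
print("mean absolute error:", mae)
```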