Python: printing the selected parameters in nested cross-validation


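Below is an example that uses scikit-learn to get cross-validated predictions from k-nearest neighbors, where k is itself chosen by cross-validation. The code seems to work, but how can I print the k that was selected in each outer fold?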

import numpy as np
# A bare "import sklearn" does not load the submodules, so import them explicitly.
import sklearn.model_selection
import sklearn.neighbors

n = 100
X = np.random.randn(n, 2)
y = np.where(np.sum(X, axis = 1) + np.random.randn(n) > 0, "blue", "red")

# Recent scikit-learn versions reject random_state unless shuffle = True.
preds = sklearn.model_selection.cross_val_predict(
    X = X,
    y = y,
    estimator = sklearn.model_selection.GridSearchCV(
       estimator = sklearn.neighbors.KNeighborsClassifier(),
       param_grid = {'n_neighbors': range(1, 7)},
       cv = sklearn.model_selection.KFold(10, shuffle = True, random_state = 133),
       scoring = 'accuracy'),
    cv = sklearn.model_selection.KFold(10, shuffle = True, random_state = 144))

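You can't get this information directly from that function, so you need to replace cross_val_predict with cross_validate and set the return_estimator flag to True. You can then retrieve the fitted estimators from the returned dictionary under the key estimator. The parameters selected for each estimator are stored in its best_params_ attribute. So: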

import numpy as np
import sklearn
# sklearn 0.20.3 doesn't seem to import submodules in __init__
# So importing them directly is required.
import sklearn.model_selection
import sklearn.neighbors

n = 100
X = np.random.randn(n, 2)
y = np.where(np.sum(X, axis = 1) + np.random.randn(n) > 0, "blue", "red")

scores = sklearn.model_selection.cross_validate(
    X = X,
    y = y,
    estimator = sklearn.model_selection.GridSearchCV(
       estimator = sklearn.neighbors.KNeighborsClassifier(),
       param_grid = {'n_neighbors': range(1, 7)},
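       # Recent scikit-learn versions reject random_state unless shuffle = True.
       cv = sklearn.model_selection.KFold(10, shuffle = True, random_state = 133),
       scoring = 'accuracy'),
    cv = sklearn.model_selection.KFold(10, shuffle = True, random_state = 144),
    return_estimator = True)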

# Selected hyper-parameters for the estimator from the first fold
print(scores['estimator'][0].best_params_)
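
The snippet above prints only the first fold. To list the k chosen in every outer fold, a minimal sketch along the same lines could loop over the returned estimators. The fixed seed, the `selected_k` name and the `shuffle=True` setting are assumptions made here for illustration, not part of the original answer.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_validate
from sklearn.neighbors import KNeighborsClassifier

# Fixed seed so the toy data is reproducible (an assumption for this sketch)
rng = np.random.RandomState(0)
n = 100
X = rng.randn(n, 2)
y = np.where(np.sum(X, axis=1) + rng.randn(n) > 0, "blue", "red")

scores = cross_validate(
    estimator=GridSearchCV(
        estimator=KNeighborsClassifier(),
        param_grid={"n_neighbors": list(range(1, 7))},
        cv=KFold(10, shuffle=True, random_state=133),
        scoring="accuracy"),
    X=X,
    y=y,
    cv=KFold(10, shuffle=True, random_state=144),
    return_estimator=True)

# scores["estimator"] holds one fitted GridSearchCV per outer fold, in fold order
selected_k = [est.best_params_["n_neighbors"] for est in scores["estimator"]]
for fold, k in enumerate(selected_k):
    print(f"outer fold {fold}: n_neighbors = {k}")
```

Each entry of `scores["estimator"]` was refit on that fold's training part only, so the printed values are the per-fold choices rather than one global k.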
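Unfortunately, you can't get both the actual predictions and the selected hyperparameters from the same function. If you need both, you have to run the nested cross-validation manually: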

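# Outer splitter; recent scikit-learn versions reject random_state unless shuffle = True.
cv = sklearn.model_selection.KFold(10, shuffle = True, random_state = 144)
# Inner grid search that selects n_neighbors on each outer training set
estimator = sklearn.model_selection.GridSearchCV(
       estimator = sklearn.neighbors.KNeighborsClassifier(),
       param_grid = {'n_neighbors': range(1, 7)},
       cv = sklearn.model_selection.KFold(10, shuffle = True, random_state = 133),
       scoring = 'accuracy')
for train, test in cv.split(X,y):
    X_train, y_train = X[train], y[train]
    X_test, y_test = X[test], y[test]
    # fit() returns the fitted GridSearchCV itself
    m = estimator.fit(X_train, y_train)
    # Hyperparameters selected for this outer fold
    print(m.best_params_)
    # Predictions for this fold's held-out samples
    y_pred = m.predict(X_test)
    print(y_pred)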

So, do you know how I can get both the predictions and the selected parameters without running the model an extra time?
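
One way to read the manual loop is that it already fits each outer fold only once, so both results can be collected in the same pass. The following is a sketch under that assumption, not a definitive implementation: it writes each fold's predictions back into the original sample order and records the chosen k alongside. The names `outer_cv`, `search`, `preds` and `chosen_k`, and the fixed seed, are hypothetical choices for illustration.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier

# Fixed seed so the toy data is reproducible (an assumption for this sketch)
rng = np.random.RandomState(0)
n = 100
X = rng.randn(n, 2)
y = np.where(np.sum(X, axis=1) + rng.randn(n) > 0, "blue", "red")

outer_cv = KFold(10, shuffle=True, random_state=144)
search = GridSearchCV(
    estimator=KNeighborsClassifier(),
    param_grid={"n_neighbors": list(range(1, 7))},
    cv=KFold(10, shuffle=True, random_state=133),
    scoring="accuracy")

preds = np.empty(n, dtype=y.dtype)  # filled fold by fold, in original sample order
chosen_k = []                       # one selected n_neighbors per outer fold

for train, test in outer_cv.split(X, y):
    # A single fit per outer fold: inner grid search plus refit on the training part
    search.fit(X[train], y[train])
    chosen_k.append(search.best_params_["n_neighbors"])
    # Place this fold's predictions at the positions of its held-out samples
    preds[test] = search.predict(X[test])

print(chosen_k)
print(preds[:10])
```

Because every sample lands in exactly one outer test fold, `preds` ends up with one out-of-fold prediction per sample, which is the same shape of result that cross_val_predict returns.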