Python: Tuning parameters in Gradient Boosting Regression with cross-validation, sklearn

python machine-learning scikit-learn

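Suppose X_train has shape (751, 411) and Y_train has shape (751L,). I want to use grid search with cross-validation to find the best parameters for GBR (gradient boosting regression). I used the following code, but it did not work: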

from sklearn.grid_search import GridSearchCV

param_grid = {'n_estimators': [100, 500],
              'learning_rate': [0.1, 0.05, 0.02],
              'max_depth': [4],
              'min_samples_leaf': [3],
              'max_features': [1.0]}
n_jobs = 4

def GradientBooster(param_grid, n_jobs):
    estimator = GradientBoostingRegressor()
    classifier = GridSearchCV(estimator=estimator, cv=5, param_grid=param_grid,
                              n_jobs=n_jobs)
    classifier.fit(X_train, Y_train)
    print classifier.best_estimator_
    return cv, classifier.best_estimator_

cv, best_est = GradientBooster(param_grid, n_jobs)

It gives me the following error:

     51         from pandas.core.config import get_option
     52 
     53         encoding = get_option("display.encoding")
---> 54         return self.__unicode__().encode(encoding, 'replace')
        self.__unicode__.encode = undefined
        encoding = 'cp0'
     55 
     56     def __repr__(self):
     57         """
     58         Return a string representation for a particular object.

LookupError: unknown encoding: cp0
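
The traceback ends inside pandas, where the display.encoding option resolves to cp0, which Python does not recognise as a codec. Two possible workarounds are sketched below. Both are assumptions that have not been tested against the asker's setup, and the small DataFrame is a hypothetical stand-in for the real X_train and Y_train.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the asker's data, which is not shown in the question.
rng = np.random.RandomState(0)
X_train = pd.DataFrame(rng.rand(20, 3), columns=["a", "b", "c"])
Y_train = pd.Series(rng.rand(20), name="target")

# Option 1 (assumption): give pandas a valid codec for rendering its objects.
pd.set_option("display.encoding", "utf-8")

# Option 2 (assumption): hand scikit-learn plain NumPy arrays so that the
# search does not have to handle pandas objects at all.
X_array = np.asarray(X_train, dtype=float)
y_array = np.asarray(Y_train, dtype=float).ravel()

print(type(X_array).__name__, X_array.shape, y_array.shape)
```

If the error persists, running the search with n_jobs=1 may at least shorten the traceback, since no worker processes are involved.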

Then I want to use these parameters to predict on X_test with the predict function.
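
For reference, here is a minimal sketch of the intended workflow (grid search with 5-fold cross-validation, then prediction on held-out data) written against the current sklearn.model_selection module. The synthetic data from make_regression, the reduced grid, and the function name gradient_booster are assumptions for illustration, not the asker's actual setup.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the asker's data (assumption: the real data is numeric).
X, y = make_regression(n_samples=200, n_features=10, noise=0.1, random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A reduced version of the grid from the question, to keep the run short.
param_grid = {
    "n_estimators": [50, 100],
    "learning_rate": [0.1, 0.05],
    "max_depth": [4],
    "min_samples_leaf": [3],
    "max_features": [1.0],
}


def gradient_booster(param_grid, n_jobs=1):
    """Run a 5-fold grid search and return the fitted search object."""
    search = GridSearchCV(
        estimator=GradientBoostingRegressor(random_state=0),
        param_grid=param_grid,
        cv=5,
        n_jobs=n_jobs,
    )
    search.fit(X_train, Y_train)
    return search


search = gradient_booster(param_grid)
print(search.best_params_)

# With the default refit=True, the best estimator is refitted on all of
# X_train, and search.predict delegates to that estimator.
predictions = search.predict(X_test)
print(predictions.shape)
```

Returning the fitted search object keeps best_estimator_, best_params_ and cv_results_ available to the caller in one place.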

I have the same problem with the following code:

param_grid = {
    'bootstrap': [True],
    'max_depth': [80, 90, 100, 110],
    'max_features': [2, 3],
    'min_samples_leaf': [3, 4, 5],
    'min_samples_split': [8, 10, 12],
    'n_estimators': [100, 200, 300, 1000]
}

rf = RandomForestRegressor()
grid_search = GridSearchCV(estimator = rf, param_grid = param_grid, 
                          cv = 3, n_jobs = -1, verbose = 2)
grid_search.fit(X_train, Y_train)

Here is a working example on a test dataset:

# load_boston was removed in scikit-learn 1.2; load_diabetes is used here instead
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

X, y = load_diabetes(return_X_y=True)

param_grid = {
    'bootstrap': [True],
    'max_depth': [80, 90, 100, 110],
    'max_features': [2, 3],
    'min_samples_leaf': [3, 4, 5],
    'min_samples_split': [8, 10, 12],
    'n_estimators': [100, 200, 300, 1000]
}

rf = RandomForestRegressor()
grid_search = GridSearchCV(estimator = rf, param_grid = param_grid, 
                          cv = 3, n_jobs = -1, verbose = 2)
grid_search.fit(X, y)

Most likely there is a problem with your data.
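
A quick way to look for common data problems before fitting is sketched below. The helper describe_training_data is hypothetical (it is not part of scikit-learn), and the checks are only a starting point, assuming the data can be converted to a NumPy array.

```python
import numpy as np


def describe_training_data(X, y):
    """Hypothetical helper: collect basic facts that commonly break a fit."""
    X = np.asarray(X)
    y = np.asarray(y)
    X_numeric = bool(np.issubdtype(X.dtype, np.number))
    y_numeric = bool(np.issubdtype(y.dtype, np.number))
    return {
        "X_shape": X.shape,
        "y_shape": y.shape,
        "same_n_rows": X.shape[0] == y.shape[0],
        "X_is_numeric": X_numeric,
        "y_is_numeric": y_numeric,
        # NaN checks only make sense for numeric arrays.
        "X_has_nan": bool(np.isnan(X).any()) if X_numeric else None,
        "y_has_nan": bool(np.isnan(y).any()) if y_numeric else None,
    }


# Tiny made-up example with one missing value in X.
X_demo = np.array([[1.0, 2.0], [np.nan, 4.0], [5.0, 6.0]])
y_demo = np.array([1.0, 2.0, 3.0])

report = describe_training_data(X_demo, y_demo)
for key, value in report.items():
    print(key, value)
```

A non-numeric dtype or a row-count mismatch between X and y would show up here before the grid search is started.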

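The error above still occurs. I just want to run GBR with cross-validation and then use the predict function on the test data. The error comes from this call:

cv, best_est = GradientBooster(param_grid, n_jobs)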

The error is too long; it fills 43 pages in a document.

Aha. Could you add a small sample dataset? That would help us reproduce this error.

cv is not defined in the code you provided.

@pythonic833 Is it different from cv=5?
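
On that last point: passing cv=5 as a keyword argument to GridSearchCV does not create a variable named cv, so a function that ends with "return cv, ..." would raise a NameError once the fit itself succeeds. The sketch below illustrates this with two hypothetical functions, broken_search and fixed_search, on synthetic data. Returning cv_results_ is one possible replacement and only an assumption about what the asker wanted.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data, used only to make the example self-contained.
X, y = make_regression(n_samples=100, n_features=5, random_state=0)


def make_search():
    """Build a small 5-fold grid search over a single candidate."""
    return GridSearchCV(
        GradientBoostingRegressor(random_state=0),
        param_grid={"n_estimators": [20]},
        cv=5,  # keyword argument only; no variable called cv is created
    )


def broken_search():
    search = make_search()
    search.fit(X, y)
    return cv, search.best_estimator_  # cv is an undefined name here


def fixed_search():
    search = make_search()
    search.fit(X, y)
    # cv_results_ holds the cross-validation scores for every candidate.
    return search.cv_results_, search.best_estimator_


try:
    broken_search()
    error_name = None
except NameError as exc:
    error_name = type(exc).__name__

cv_results, best_est = fixed_search()
print(error_name)
print(cv_results["mean_test_score"])
```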