Python 2.7: Grid search for a support vector machine in machine learning

python-2.7 machine-learning scikit-learn

I'm working on a project where I need to obtain the best estimator returned by grid search.

parameters = {'gamma':[0.1, 0.5, 1, 10, 100], 'C':[1, 5, 10, 100, 1000]}

# TODO: Initialize the classifier
svr = svm.SVC()

# TODO: Make an f1 scoring function using 'make_scorer' 
f1_scorer = make_scorer(score_func)

# TODO: Perform grid search on the classifier using the f1_scorer as the scoring method
grid_obj = grid_search.GridSearchCV(svr, parameters, scoring=f1_scorer)

# TODO: Fit the grid search object to the training data and find the optimal parameters
grid_obj = grid_obj.fit(X_train, y_train)
pred = grid_obj.predict(X_test)
def score_func():
    f1_score(y_test, pred, pos_label='yes')

# Get the estimator
clf = grid_obj.best_estimator_

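I don't know how to write the f1_scorer function, because I only make the predictions after creating the grid search object. I can't declare f1_scorer after creating the object either, because grid search uses it as its scoring method. How do I create this scoring function for grid search?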

The scoring function passed to make_scorer should take y_true and y_pred as arguments. With that information you have everything you need to compute the score. GridSearchCV will then fit the estimator and call the score function internally for each candidate parameter set, so you don't need to compute y_pred beforehand.

It should look like this:

def score_func(y_true, y_pred):
    """Calculate f1 score given the predicted and expected labels"""
    return f1_score(y_true, y_pred, pos_label='yes')

f1_scorer = make_scorer(score_func)
GridSearchCV(svr, parameters, scoring=f1_scorer)
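For readers who want to run the whole thing end to end, here is a minimal sketch. It assumes a recent scikit-learn, where GridSearchCV is imported from sklearn.model_selection (the sklearn.grid_search module used in the question was removed in later releases). The synthetic 'yes'/'no' data is a stand-in for the asker's dataset, which isn't shown. It also passes pos_label straight to make_scorer, which forwards extra keyword arguments to the metric, so the wrapper function becomes optional.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score, make_scorer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data with 'yes'/'no' labels (assumption: the real data is not shown)
X, y_int = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y_int == 1, 'yes', 'no')
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

parameters = {'gamma': [0.1, 0.5, 1, 10, 100], 'C': [1, 5, 10, 100, 1000]}

# Keyword arguments given to make_scorer are forwarded to f1_score
f1_scorer = make_scorer(f1_score, pos_label='yes')

# The scorer is defined before the search; no predictions are needed up front
grid_obj = GridSearchCV(SVC(), parameters, scoring=f1_scorer)
grid_obj.fit(X_train, y_train)

# Best estimator, refit on all of X_train with the winning parameters
clf = grid_obj.best_estimator_

# The held-out test set is only used here, after the search has finished
test_f1 = f1_score(y_test, clf.predict(X_test), pos_label='yes')
```

grid_obj.best_params_ holds the winning gamma and C, and grid_obj.best_score_ holds the mean cross-validated F1 for that combination.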

Thanks, this works. If I may ask, how does grid search produce the predictions on its own? Does it have something to do with the make_scorer function?

It does the same thing you would do with the estimator's .predict() method. It internally splits the data into a training set and a test set. It then fits on the training set (a subset of X_train, y_train), predicts on its internal test set (also a subset of X_train, y_train), and compares those predictions with the true labels. So it never uses your X_test. That is what lets you evaluate your final model without bias!
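To make that internal process concrete, here is a simplified sketch of roughly what the search does for each parameter combination. It is not scikit-learn's actual implementation. The synthetic data, the reduced parameter grid, and the choice of three stratified folds are all assumptions made for illustration, and the imports assume a recent scikit-learn.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import ParameterGrid, StratifiedKFold
from sklearn.svm import SVC

# Synthetic stand-in for the training data (assumption: the real data is not shown)
X_train, y_int = make_classification(n_samples=150, n_features=5, random_state=0)
y_train = np.where(y_int == 1, 'yes', 'no')

# Smaller grid than in the question, to keep the illustration short
parameters = {'gamma': [0.1, 1], 'C': [1, 10]}
cv = StratifiedKFold(n_splits=3)

mean_scores = {}
for params in ParameterGrid(parameters):
    fold_scores = []
    # Every split uses only X_train / y_train; X_test is never touched
    for fit_idx, val_idx in cv.split(X_train, y_train):
        model = SVC(**params).fit(X_train[fit_idx], y_train[fit_idx])
        y_pred = model.predict(X_train[val_idx])  # predictions made internally
        fold_scores.append(f1_score(y_train[val_idx], y_pred, pos_label='yes'))
    mean_scores[(params['C'], params['gamma'])] = np.mean(fold_scores)

# Pick the combination with the highest mean F1, then refit on all training data
best_C, best_gamma = max(mean_scores, key=mean_scores.get)
best_model = SVC(C=best_C, gamma=best_gamma).fit(X_train, y_train)
```

The y_pred passed to the scoring function comes from this inner loop, which is why the scorer only needs to accept y_true and y_pred.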