Python: replacing the deprecated cross_validation.StratifiedKFold with model_selection.StratifiedKFold

Tags: python, machine-learning, scikit-learn

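I was following some example scripts from 3 years ago and ran into a function definition that uses a deprecated function (cross_validation.StratifiedKFold).

Here is the original code snippet from 3 years ago: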

def stratified_cv(X, y, clf_class, shuffle=True, n_folds=10, **kwargs):
    stratified_k_fold = cross_validation.StratifiedKFold(y, n_folds=n_folds, shuffle=shuffle)
    y_pred = y.copy()
    # ii -> train
    # jj -> test indices
    for ii, jj in stratified_k_fold: 
        X_train, X_test = X[ii], X[jj]
        y_train = y[ii]
        clf = clf_class(**kwargs)
        clf.fit(X_train,y_train)
        y_pred[jj] = clf.predict(X_test)
    return y_pred
I tried to update it by following the documentation for sklearn.model_selection.StratifiedKFold(), and this is what I have so far:

## Attempt to modernize with StratifiedKFold from sklearn.model_selection
def stratified_cv(X, y, clf_class, shuffle=True, n_splits=10, **kwargs):
    stratified_k_fold = StratifiedKFold(n_splits=n_splits)
    y_pred = y.copy()
    # ii -> train
    # jj -> test indices
    for ii, jj in stratified_k_fold: 
        X_train, X_test = X[ii], X[jj]
        y_train = y[ii]
        clf = clf_class(**kwargs)
        clf.fit(X_train,y_train)
        y_pred[jj] = clf.predict(X_test)
    return y_pred
I then tried to run the following block and got the error shown after it:

print('Gradient Boosting Classifier:  {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, ensemble.GradientBoostingClassifier))))
print('Support vector machine(SVM):   {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, svm.SVC))))
print('Random Forest Classifier:      {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, ensemble.RandomForestClassifier))))
print('K Nearest Neighbor Classifier: {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, neighbors.KNeighborsClassifier))))
print('Logistic Regression:           {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, linear_model.LogisticRegression))))
Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-122-a61be22f8ca9> in <module>
----> 1 print('Gradient Boosting Classifier:  {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, ensemble.GradientBoostingClassifier))))
      2 print('Support vector machine(SVM):   {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, svm.SVC))))
      3 print('Random Forest Classifier:      {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, ensemble.RandomForestClassifier))))
      4 print('K Nearest Neighbor Classifier: {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, neighbors.KNeighborsClassifier))))
      5 print('Logistic Regression:           {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, linear_model.LogisticRegression))))

<ipython-input-121-e373d74b2cca> in stratified_cv(X, y, clf_class, shuffle, n_splits, **kwargs)
      5     # ii -> train
      6     # jj -> test indices
----> 7     for ii, jj in stratified_k_fold:
      8         X_train, X_test = X[ii], X[jj]
      9         y_train = y[ii]

TypeError: 'StratifiedKFold' object is not iterable

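In sklearn.model_selection, a StratifiedKFold object only holds the splitting configuration and is not iterable itself, which is why the loop raises the TypeError. You need to call its split(X, y) method to get the train/test indices for each fold. Without changing your code much, the following should work: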

from sklearn.model_selection import StratifiedKFold
from sklearn import datasets
from sklearn import metrics
from sklearn import svm

iris = datasets.load_iris()
X = iris.data
y = iris.target

def stratified_cv(X, y, clf_class, shuffle=True, n_splits=10, **kwargs):
    # Pass shuffle through so the argument is honoured, as in the old API call
    stratified_k_fold = StratifiedKFold(n_splits=n_splits, shuffle=shuffle)
    y_pred = y.copy()

    # split() yields (train indices, test indices) for each fold
    for ii, jj in stratified_k_fold.split(X, y):
        X_train, X_test = X[ii], X[jj]
        y_train = y[ii]
        clf = clf_class(**kwargs)
        clf.fit(X_train, y_train)
        y_pred[jj] = clf.predict(X_test)
    return y_pred

print('Support vector machine(SVM):   {:.2f}'.format(metrics.accuracy_score(y, stratified_cv(X, y, svm.SVC))))
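As a possible alternative, not part of the original answer, the hand-written loop could probably be replaced by scikit-learn's cross_val_predict, which also returns one out-of-fold prediction per sample. The sketch below assumes the same iris data and svm.SVC as above. The random_state=0 value and the fold_sizes variable are illustrative choices only.

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import StratifiedKFold, cross_val_predict

iris = datasets.load_iris()
X, y = iris.data, iris.target

# random_state=0 is an arbitrary choice that makes the shuffled folds reproducible
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# The splitter is not iterable; split() yields (train_idx, test_idx) pairs
fold_sizes = [len(test_idx) for _, test_idx in cv.split(X, y)]

# cross_val_predict fits a clone of the estimator on each training fold and
# collects the prediction made for every sample while it was in the test fold
y_pred = cross_val_predict(svm.SVC(), X, y, cv=cv)
accuracy = metrics.accuracy_score(y, y_pred)
print('Support vector machine(SVM):   {:.2f}'.format(accuracy))
```

Passing an estimator instance rather than a class means constructor arguments go directly into svm.SVC(...) instead of through **kwargs.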

This works like a charm! Thank you very much :D