Python: GridSearch for a multi-label OneVsRestClassifier?
I am running a grid search on multi-label data, as follows:
# imports
import numpy as np
from sklearn.svm import SVC as classifier
from sklearn.pipeline import Pipeline
from sklearn.decomposition import RandomizedPCA
from sklearn.cross_validation import StratifiedKFold
from sklearn.grid_search import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier

# classifier pipeline: one PCA + SVC pipeline per label
clf_pipeline = OneVsRestClassifier(
    Pipeline([('reduce_dim', RandomizedPCA()),
              ('clf', classifier())
              ]
             ))

# parameter ranges to search
C_range = 10.0 ** np.arange(-2, 9)
gamma_range = 10.0 ** np.arange(-5, 4)
n_components_range = (10, 100, 200)
degree_range = (1, 2, 3, 4)

# the estimator__ prefix reaches through OneVsRestClassifier into the pipeline
param_grid = dict(estimator__clf__gamma=gamma_range,
                  estimator__clf__C=C_range,
                  estimator__clf__degree=degree_range,
                  estimator__reduce_dim__n_components=n_components_range)

# X is the feature matrix and Y the multi-label targets (defined elsewhere)
grid = GridSearchCV(clf_pipeline, param_grid,
                    cv=StratifiedKFold(y=Y, n_folds=3), n_jobs=1,
                    verbose=2)
grid.fit(X, Y)
I see the following traceback:
/Users/andrewwinterman/Documents/sparks-honey/classifier/lib/python2.7/site-packages/sklearn/grid_search.pyc in fit_grid_point(X, y, base_clf, clf_params, train, test, loss_func, score_func, verbose, **fit_params)
107
108 if y is not None:
--> 109 y_test = y[safe_mask(y, test)]
110 y_train = y[safe_mask(y, train)]
111 clf.fit(X_train, y_train, **fit_params)
TypeError: only integer arrays with one element can be converted to an index
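It looks like GridSearchCV objects to multiple labels. How should I work around this? Do I need to explicitly iterate over the unique classes with label_binarizer, running a grid search on each sub-estimator?

I think there is a bug in grid_search.py. Have you tried passing y as a NumPy array?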
import numpy as np
Y = np.asarray(Y)
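
For readers on a recent scikit-learn release, here is a minimal sketch of the same search with Y as a 2-D label indicator array. Several things in it are assumptions that are not in the original post: the data is synthetic (make_multilabel_classification), sklearn.model_selection and PCA stand in for the older sklearn.grid_search, sklearn.cross_validation and RandomizedPCA modules, the parameter ranges are shrunk so it runs quickly, and plain KFold replaces StratifiedKFold because, as far as I know, StratifiedKFold only accepts binary or multiclass targets.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Synthetic stand-in for the poster's data (assumption): Y is a 0/1 indicator matrix.
X, Y = make_multilabel_classification(
    n_samples=90, n_features=20, n_classes=3, random_state=0
)
Y = np.asarray(Y)  # shape (n_samples, n_labels)

# One PCA + SVC pipeline is fitted per label column.
clf_pipeline = OneVsRestClassifier(
    Pipeline([("reduce_dim", PCA(random_state=0)), ("clf", SVC())])
)

# Deliberately small grid (assumption) so the sketch finishes in seconds.
param_grid = {
    "estimator__clf__C": [0.1, 1.0, 10.0],
    "estimator__clf__gamma": [0.01, 0.1],
    "estimator__reduce_dim__n_components": [5, 10],
}

# Plain K-fold splitting, since the target is a multi-label indicator matrix.
grid = GridSearchCV(
    clf_pipeline,
    param_grid,
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    n_jobs=1,
)
grid.fit(X, Y)

print(grid.best_params_)
print(grid.best_score_)
```

The default score here is subset accuracy (a sample counts as correct only if every label matches), which can be harsh; passing a scoring argument such as "f1_micro" is one alternative.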
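Are you using 0.12.1 or 0.13? I think the problem should go away when you upgrade to 0.13.

I am using the 0.13 development branch. I will try again shortly.

It should work in the 0.13 release and in current master. If it does not, please open an issue on GitHub.

It still does not work on 0.13.1.

No, I have actually stopped using scikit-learn. You will have to try any proposed solution yourself. If you can show that one of them works, I will accept it :)

@ZéRicardo, glad to hear that :)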
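
The alternative raised in the question, running a separate grid search for each label's sub-estimator, can be sketched by putting GridSearchCV inside OneVsRestClassifier rather than around it. This is only an illustration under assumptions, not a fix confirmed in the thread: it uses a recent scikit-learn release, synthetic data, a bare SVC without the PCA step, and a small made-up grid. Because each sub-problem is binary, StratifiedKFold can be used here.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

# Synthetic multi-label data (assumption): Y is a 0/1 indicator matrix.
X, Y = make_multilabel_classification(
    n_samples=90, n_features=20, n_classes=3, random_state=0
)

# Each label gets its own independent search over the SVC parameters.
per_label_search = GridSearchCV(
    SVC(),
    {"C": [0.1, 1.0, 10.0], "gamma": [0.01, 0.1]},
    cv=StratifiedKFold(n_splits=3),
)
ovr = OneVsRestClassifier(per_label_search)
ovr.fit(X, Y)

# estimators_ holds one fitted GridSearchCV per label column.
for label_index, search in enumerate(ovr.estimators_):
    print(label_index, search.best_params_)
```

The trade-off is that every label may end up with different hyperparameters, whereas the outer search picks a single setting shared by all labels.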