
Python scikit-learn GridSearchCV raises "ValueError: multiclass format is not supported"


I am trying to use GridSearchCV to tune the parameters of LinearSVC(), as follows:

clf_SVM = LinearSVC()
params = {
          'C': [0.5, 1.0, 1.5],
          'tol': [1e-3, 1e-4, 1e-5],
          'multi_class': ['ovr', 'crammer_singer'],
          }
gs = GridSearchCV(clf_SVM, params, cv=5, scoring='roc_auc')
gs.fit(corpus1, y)
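corpus1 has shape (1726, 7001) and y has shape (1726,).

This is a multiclass classification problem: y takes values from 0 to 3 inclusive, i.e. there are four classes.

But this gives me the following error: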

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-220-0c627bda0543> in <module>()
      5           }
      6 gs = GridSearchCV(clf_SVM, params, cv=5, scoring='roc_auc')
----> 7 gs.fit(corpus1, y)

/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.pyc in fit(self, X, y)
    594 
    595         """
--> 596         return self._fit(X, y, ParameterGrid(self.param_grid))
    597 
    598 

/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.pyc in _fit(self, X, y, parameter_iterable)
    376                                     train, test, self.verbose, parameters,
    377                                     self.fit_params, return_parameters=True)
--> 378             for parameters in parameter_iterable
    379             for train, test in cv)
    380 

/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
    651             self._iterating = True
    652             for function, args, kwargs in iterable:
--> 653                 self.dispatch(function, args, kwargs)
    654 
    655             if pre_dispatch == "all" or n_jobs == 1:

/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.pyc in dispatch(self, func, args, kwargs)
    398         """
    399         if self._pool is None:
--> 400             job = ImmediateApply(func, args, kwargs)
    401             index = len(self._jobs)
    402             if not _verbosity_filter(index, self.verbose):

/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.pyc in __init__(self, func, args, kwargs)
    136         # Don't delay the application, to avoid keeping the input
    137         # arguments in memory
--> 138         self.results = func(*args, **kwargs)
    139 
    140     def get(self):

/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters)
   1238     else:
   1239         estimator.fit(X_train, y_train, **fit_params)
-> 1240     test_score = _score(estimator, X_test, y_test, scorer)
   1241     if return_train_score:
   1242         train_score = _score(estimator, X_train, y_train, scorer)

/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.pyc in _score(estimator, X_test, y_test, scorer)
   1294         score = scorer(estimator, X_test)
   1295     else:
-> 1296         score = scorer(estimator, X_test, y_test)
   1297     if not isinstance(score, numbers.Number):
   1298         raise ValueError("scoring must return a number, got %s (%s) instead."

/usr/local/lib/python2.7/dist-packages/sklearn/metrics/scorer.pyc in __call__(self, clf, X, y)
    136         y_type = type_of_target(y)
    137         if y_type not in ("binary", "multilabel-indicator"):
--> 138             raise ValueError("{0} format is not supported".format(y_type))
    139 
    140         try:

ValueError: multiclass format is not supported
---------------------------------------------------------------------------
From the documentation:

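"Note: this implementation is restricted to the binary classification task or multilabel classification task in label indicator format."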

Try binarizing y with preprocessing.label_binarize() before training. This performs a "one-hot" encoding of your y.
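To make the effect of binarizing concrete, here is a minimal sketch. The array `y_demo` is a hypothetical stand-in for the question's y (integer labels 0 to 3), and `type_of_target` is the same helper the scorer in the traceback calls to decide whether it supports the target.

```python
import numpy as np
from sklearn.preprocessing import label_binarize
from sklearn.utils.multiclass import type_of_target

# Hypothetical stand-in for the question's y: integer labels 0..3.
y_demo = np.array([0, 1, 2, 3, 1, 0, 3, 2])

# This is the target type the roc_auc scorer in the traceback rejects.
print(type_of_target(y_demo))

# Binarize: one indicator column per class, exactly one 1 per row.
y_bin = label_binarize(y_demo, classes=[0, 1, 2, 3])
print(y_bin.shape)
print(type_of_target(y_bin))
```

After binarizing, y has one column per class, which is the label indicator format that the quoted note says is supported.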

As already mentioned, you must first binarize y:

y = label_binarize(y, classes=[0, 1, 2, 3])
and then use a multiclass learning algorithm such as OneVsRestClassifier or OneVsOneClassifier. For example:

clf_SVM = OneVsRestClassifier(LinearSVC())
params = {
      'estimator__C': [0.5, 1.0, 1.5],
      'estimator__tol': [1e-3, 1e-4, 1e-5],
      }
gs = GridSearchCV(clf_SVM, params, cv=5, scoring='roc_auc')
gs.fit(corpus1, y)
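The snippet above can be put together into a self-contained sketch. Everything about the data here is an assumption: `X_demo` and `y_demo` are synthetic stand-ins for corpus1 and y generated with `make_classification`, and the import comes from `sklearn.model_selection`, which newer scikit-learn releases use in place of the `sklearn.grid_search` module shown in the traceback. Behavior may differ between scikit-learn versions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize
from sklearn.svm import LinearSVC

# Synthetic stand-in for corpus1 / y (assumption: four classes, labels 0..3).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=20, n_informative=5,
    n_classes=4, random_state=0,
)

# Convert the labels to label indicator format, shape (400, 4).
y_bin = label_binarize(y_demo, classes=[0, 1, 2, 3])

# One binary LinearSVC per class; the grid keys are prefixed with
# "estimator__" because the parameters belong to the wrapped estimator.
clf = OneVsRestClassifier(LinearSVC(max_iter=5000))
params = {
    "estimator__C": [0.5, 1.0, 1.5],
    "estimator__tol": [1e-3, 1e-4, 1e-5],
}

gs = GridSearchCV(clf, params, cv=5, scoring="roc_auc")
gs.fit(X_demo, y_bin)
print(gs.best_params_, gs.best_score_)
```

Note that 'multi_class' is dropped from the grid here, as in the answer: OneVsRestClassifier already handles the one-vs-rest decomposition.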

Remove scoring='roc_auc' and it will work, because the roc_auc curve does not support categorical data.
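As a rough illustration of this alternative, the sketch below keeps y as plain integer labels and swaps the scorer for 'accuracy', which a commenter reports worked for them. The data is synthetic (`X_demo` and `y_demo` are hypothetical stand-ins for corpus1 and y), and the import path assumes a scikit-learn release that provides `sklearn.model_selection`.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import LinearSVC

# Synthetic stand-in for corpus1 / y (assumption: four classes, labels 0..3).
X_demo, y_demo = make_classification(
    n_samples=400, n_features=20, n_informative=5,
    n_classes=4, random_state=0,
)

# Same grid as in the question; y stays a 1-D multiclass label vector.
params = {
    "C": [0.5, 1.0, 1.5],
    "tol": [1e-3, 1e-4, 1e-5],
    "multi_class": ["ovr", "crammer_singer"],
}

# 'accuracy' accepts multiclass targets, so no binarization is needed.
gs = GridSearchCV(LinearSVC(max_iter=5000), params, cv=5, scoring="accuracy")
gs.fit(X_demo, y_demo)
print(gs.best_params_, gs.best_score_)
```

The trade-off is that the model is then selected on accuracy rather than on ROC AUC, which may or may not suit the problem.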

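Depending on your problem, you can encode the labels as categorical directly instead of using preprocessing.label_binarize(). The problem actually comes from using scoring='roc_auc'. Note that roc_auc does not support categorical data.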

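Can you print the shapes of the variables used in .fit?
corpus1 has shape (1726, 7001) and y has shape (1726,).
I had the same problem when using the 'roc_auc' scoring; I used 'accuracy' instead and it worked.
Thanks, but I have now checked the shapes of y and corpus1, and they are (1726, 4) and (1726, 7001).
Is your shape now (1380, 4)? The transformed y should be (1726, 4).
Are all 4 classes present in your y variable?
Yes, see the first 30 rows here -
@user1269942 I don't see this "Note: this implementation is restricted to the binary classification task or multilabel classification task in label indicator format." regarding this issue. Can you explain where I should look?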