scikit-learn: How do I use a different scoring function in LogisticRegressionCV?

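I am trying to use the LogisticRegressionCV class from scikit-learn 0.16, but I am having trouble using a different scoring function. The documentation says to pass in one of the scoring functions from sklearn.metrics, so I tried the following code: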

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import log_loss

...

model_regression = LogisticRegressionCV(scoring=log_loss)
model_regression.fit(data_combined, winners_losers)
However, I get the following error from the fit call:

  File "C:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py", line 1381, in fit
    for label in iter_labels
  File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 659, in __call__
    self.dispatch(function, args, kwargs)
  File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 406, in dispatch
    job = ImmediateApply(func, args, kwargs)
  File "C:\Anaconda3\lib\site-packages\sklearn\externals\joblib\parallel.py", line 140, in __init__
    self.results = func(*args, **kwargs)
  File "C:\Anaconda3\lib\site-packages\sklearn\linear_model\logistic.py", line 844, in _log_reg_scoring_path
    scores.append(scoring(log_reg, X_test, y_test))
  File "C:\Anaconda3\lib\site-packages\sklearn\metrics\classification.py", line 1403, in log_loss
    T = lb.fit_transform(y_true)
  File "C:\Anaconda3\lib\site-packages\sklearn\base.py", line 433, in fit_transform
    return self.fit(X, **fit_params).transform(X)
  File "C:\Anaconda3\lib\site-packages\sklearn\preprocessing\label.py", line 315, in fit
    self.y_type_ = type_of_target(y)
  File "C:\Anaconda3\lib\site-packages\sklearn\utils\multiclass.py", line 287, in type_of_target
    'got %r' % y)
ValueError: Expected array-like (array or non-string sequence), got LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr',
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0)

What am I doing wrong? Without the 'scoring=log_loss' argument the call works fine, so it must be related to how I am passing the function.

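It should be scoring="neg_log_loss", a string, not a function. If you want to pass a callable, it needs a different interface. See the documentation. The callable should take three arguments: the fitted estimator, the data to score (X), and the known true targets (y). log_loss instead has the signature (y_true, y_pred), so the fitted estimator ends up being passed as y_true, which is exactly what the ValueError in the traceback reports.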
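To illustrate the three-argument interface described above, here is a minimal sketch of a hand-written scorer. The function name neg_log_loss_scorer, the synthetic data from make_classification, and the Cs and cv values are assumptions made for this example; they are not from the question. The sketch also assumes a scikit-learn version where log_loss accepts the labels argument (0.18 or later).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import log_loss


def neg_log_loss_scorer(estimator, X, y):
    """Hypothetical scorer with the (estimator, X, y) signature.

    Scorers follow a "higher is better" convention, so the log loss
    is negated before it is returned.
    """
    proba = estimator.predict_proba(X)
    return -log_loss(y, proba, labels=estimator.classes_)


# Synthetic binary data, used only to make the sketch runnable.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Pass the callable itself ...
model_callable = LogisticRegressionCV(Cs=5, cv=3, scoring=neg_log_loss_scorer)
model_callable.fit(X, y)

# ... or the equivalent built-in scorer by name.
model_string = LogisticRegressionCV(Cs=5, cv=3, scoring="neg_log_loss")
model_string.fit(X, y)

print(model_callable.C_, model_string.C_)
```

Both models rank the candidate C values by the same quantity, so they should normally select the same regularization strength.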

To supply a function, you need the make_scorer wrapper:

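from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import accuracy_score, make_scorer

scorefunc = accuracy_score  # Replace with a custom metric(y_true, y_pred)
# make_scorer wraps the metric into a scorer(estimator, X, y) callable
myscorer = make_scorer(scorefunc, greater_is_better=True)

LogisticRegressionCV(scoring=myscorer)  # plus any other parameters you need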
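For a metric where lower is better, make_scorer can flip the sign for you. The following is a rough sketch under stated assumptions: error_rate is a hypothetical custom metric written for this example, and the data and the Cs and cv values are synthetic placeholders, not taken from the question.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import make_scorer


def error_rate(y_true, y_pred):
    """Hypothetical custom metric: fraction of misclassified samples."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))


# greater_is_better=False tells make_scorer to negate the metric, so that
# model selection can still maximize the resulting score.
error_scorer = make_scorer(error_rate, greater_is_better=False)

# Synthetic binary data, used only to make the sketch runnable.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegressionCV(Cs=5, cv=3, scoring=error_scorer)
model.fit(X, y)

# The scorer is called as scorer(estimator, X, y) and returns the
# negated error rate.
score = error_scorer(model, X, y)
print(score)
```

The same sign convention is why the built-in string for log loss is "neg_log_loss" rather than "log_loss".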

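As a side note, it would be nice if sklearn's LogisticRegression were primarily a regression, with a new LogisticClassification class wrapping it. Currently it is not possible to supply a regression error or real-valued targets. (Argh)

Hmm... the documentation says string, callable, or None. I also get this error when passing a callable.

Yes, but not an arbitrary callable: it has to conform to the interface specified in the documentation I linked to. I edited my answer to summarize the docs.

"log_loss" doesn't work. It should be "neg_log_loss".

Thanks, in the 5 years since I posted, that did indeed change ;)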