Python: XGBoost not running with CalibratedClassifierCV


I am trying to run XGBoost with a calibrated classifier. Below is the code snippet that produces the error:

from sklearn.calibration import CalibratedClassifierCV
from xgboost import XGBClassifier
import numpy as np

x_train =np.array([1,2,2,3,4,5,6,3,4,10,]).reshape(-1,1)
y_train = np.array([1,1,1,1,1,3,3,3,3,3])

x_cfl=XGBClassifier(n_estimators=1)
x_cfl.fit(x_train,y_train)
sig_clf = CalibratedClassifierCV(x_cfl, method="sigmoid")
sig_clf.fit(x_train, y_train)

Error:

TypeError: predict_proba() got an unexpected keyword argument 'X'

Full traceback:

TypeError                                Traceback (most recent call last)
<ipython-input-48-08dd0b4ae8aa> in <module>
----> 1 sig_clf.fit(x_train, y_train)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in fit(self, X, y, sample_weight)
    309                 parallel = Parallel(n_jobs=self.n_jobs)
    310 
--> 311                 self.calibrated_classifiers_ = parallel(
    312                     delayed(_fit_classifier_calibrator_pair)(
    313                         clone(base_estimator), X, y, train=train, test=test,

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in _dispatch(self, batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch, callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here, causing self._jobs to

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573 
    574     def get(self):

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264 

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/fixes.py in __call__(self, *args, **kwargs)
    220     def __call__(self, *args, **kwargs):
    221         with config_context(**self.config):
--> 222             return self.function(*args, **kwargs)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _fit_classifier_calibrator_pair(estimator, X, y, train, test, supports_sw, method, classes, sample_weight)
    443     n_classes = len(classes)
    444     pred_method = _get_prediction_method(estimator)
--> 445     predictions = _compute_predictions(pred_method, X[test], n_classes)
    446 
    447     sw = None if sample_weight is None else sample_weight[test]

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _compute_predictions(pred_method, X, n_classes)
    499         (X.shape[0], 1).
    500     """
--> 501     predictions = pred_method(X=X)
    502     if hasattr(pred_method, '__name__'):
    503         method_name = pred_method.__name__

TypeError: predict_proba() got an unexpected keyword argument 'X'

Output:

CalibratedClassifierCV(base_estimator=LGBMClassifier(n_estimators=1))

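Is something wrong with my XGBoost installation? I installed it with conda, and I remember uninstalling xgboost yesterday and installing it again.

My xgboost version: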

1.3.0


This is fixed now. It looks like there is a bug in scikit-learn 0.24.


I downgraded to 0.22.2.post1 and the error went away.
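
Because the behaviour depends on which versions are installed, it can help to print them before changing anything. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `release_tuple` are made up for illustration and are not part of scikit-learn or xgboost.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def release_tuple(version_string):
    """Turn '0.22.2.post1' into (0, 22, 2), stopping at the first non-numeric part."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


# Report what is installed in the current environment.
for name in ("scikit-learn", "xgboost"):
    print(name, installed_version(name))

# Tuples compare element by element, so release numbers can be ordered.
print(release_tuple("0.22.2.post1") < (0, 24))
```

The tuple comparison is deliberately simple and ignores pre-release and post-release suffixes, so treat it as a rough check only.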

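I think the problem comes from XGBoost. The explanation is as follows:

XGBoost defines:

predict_proba(self, data, ...)

instead of:

predict_proba(self, X, ...)

Because sklearn 0.24 calls clf.predict_proba(X=X), an exception is raised.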
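
The mismatch can be reproduced without either library. The sketch below uses a hypothetical stand-in class, `LegacyClassifier`, whose `predict_proba` names its first parameter `data`; it is not xgboost code and only mimics the signature described above.

```python
import inspect


class LegacyClassifier:
    """Hypothetical stand-in: the first parameter is named `data`, not `X`."""

    def predict_proba(self, data):
        # Return a fixed two-class probability row for each input sample.
        return [[0.5, 0.5] for _ in data]


clf = LegacyClassifier()
samples = [[1], [2], [3]]

# List the parameter names the bound method actually accepts.
param_names = list(inspect.signature(clf.predict_proba).parameters)
print(param_names)

# A positional call works regardless of what the parameter is called.
positional_result = clf.predict_proba(samples)

# A keyword call with X= fails because no parameter is named X.
try:
    clf.predict_proba(X=samples)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

Inspecting the signature this way is a quick check of whether an estimator accepts `X` as a keyword argument.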


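Here is a way to work around the problem without changing package versions: create a class that inherits from XGBClassifier, override predict_proba so that its first argument is named X, and call super().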
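
The shape of that workaround can be sketched as follows. To keep the example runnable without xgboost, it subclasses a hypothetical stand-in, `LegacyClassifier`, instead of the real classifier; `CustomClassifier` is likewise an illustrative name. With xgboost installed, the same override would presumably be placed on a subclass of `XGBClassifier`, which is not tested here.

```python
class LegacyClassifier:
    """Hypothetical stand-in for a classifier whose predict_proba takes `data`."""

    def predict_proba(self, data):
        # Return a fixed two-class probability row for each input sample.
        return [[0.5, 0.5] for _ in data]


class CustomClassifier(LegacyClassifier):
    """Expose the first argument of predict_proba under the name `X`."""

    def predict_proba(self, X, *args, **kwargs):
        # Forward X positionally so the parent's parameter name no longer matters.
        return super().predict_proba(X, *args, **kwargs)


clf = CustomClassifier()
samples = [[1], [2], [3]]

# Both call styles now succeed and give the same result.
by_keyword = clf.predict_proba(X=samples)
by_position = clf.predict_proba(samples)
print(by_keyword == by_position)
```

Passing `X` positionally to the parent is the design choice that matters: the wrapper then works whether the parent calls its parameter `data` or `X`.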

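Comments:

Please accept your own answer so that it is clearly visible and helpful to others in the future. If you have identified the bug, is there an issue for it on scikit-learn's GitHub?

Yes, there is a bug report; see the PR thread.

Nice, it looks like the fix is in xgboost version >= 1.3.2 (with any sklearn version).

I get exactly the same error with catboost (0.24.4): TypeError: predict_proba() got an unexpected keyword argument 'X'. Do you know how to fix it?

@6761646f6e Do you have a template for the new class that inherits from XGBClassifier?

Hi, my new class inheriting from XGBClassifier contains: def predict_proba(self, X, ...): return super(CustomXGBClassifier, self).predict_proba(X, ...). I also noticed a problem with the kwargs attribute: if you need to sklearn.base.clone the model, you also have to override get_params.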