XGBoost does not run with CalibratedClassifierCV

I am trying to run XGBoost with a calibrated classifier (CalibratedClassifierCV). Below is the code snippet that raises the error:

from sklearn.calibration import CalibratedClassifierCV
from xgboost import XGBClassifier
import numpy as np

# Ten one-feature samples with two class labels (1 and 3)
x_train = np.array([1, 2, 2, 3, 4, 5, 6, 3, 4, 10]).reshape(-1, 1)
y_train = np.array([1, 1, 1, 1, 1, 3, 3, 3, 3, 3])

x_cfl = XGBClassifier(n_estimators=1)
x_cfl.fit(x_train, y_train)
sig_clf = CalibratedClassifierCV(x_cfl, method="sigmoid")
sig_clf.fit(x_train, y_train)  # this call raises the TypeError

Error:

TypeError: predict_proba() got an unexpected keyword argument 'X'

Full traceback:

TypeError                                Traceback (most recent call last)
<ipython-input-48-08dd0b4ae8aa> in <module>
----> 1 sig_clf.fit(x_train, y_train)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in fit(self, X, y, sample_weight)
    309                 parallel = Parallel(n_jobs=self.n_jobs)
    310 
--> 311                 self.calibrated_classifiers_ = parallel(
    312                     delayed(_fit_classifier_calibrator_pair)(
    313                         clone(base_estimator), X, y, train=train, test=test,

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self, iterable)
   1039             # remaining jobs.
   1040             self._iterating = False
-> 1041             if self.dispatch_one_batch(iterator):
   1042                 self._iterating = self._original_iterator is not None
   1043 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in dispatch_one_batch(self, iterator)
    857                 return False
    858             else:
--> 859                 self._dispatch(tasks)
    860                 return True
    861 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in _dispatch(self, batch)
    775         with self._lock:
    776             job_idx = len(self._jobs)
--> 777             job = self._backend.apply_async(batch, callback=cb)
    778             # A job can complete so quickly than its callback is
    779             # called before we get here, causing self._jobs to

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in apply_async(self, func, callback)
    206     def apply_async(self, func, callback=None):
    207         """Schedule a func to be run"""
--> 208         result = ImmediateResult(func)
    209         if callback:
    210             callback(result)

~/anaconda3/lib/python3.8/site-packages/joblib/_parallel_backends.py in __init__(self, batch)
    570         # Don't delay the application, to avoid keeping the input
    571         # arguments in memory
--> 572         self.results = batch()
    573 
    574     def get(self):

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in __call__(self)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264 

~/anaconda3/lib/python3.8/site-packages/joblib/parallel.py in <listcomp>(.0)
    260         # change the default number of processes to -1
    261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
--> 262             return [func(*args, **kwargs)
    263                     for func, args, kwargs in self.items]
    264 

~/anaconda3/lib/python3.8/site-packages/sklearn/utils/fixes.py in __call__(self, *args, **kwargs)
    220     def __call__(self, *args, **kwargs):
    221         with config_context(**self.config):
--> 222             return self.function(*args, **kwargs)

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _fit_classifier_calibrator_pair(estimator, X, y, train, test, supports_sw, method, classes, sample_weight)
    443     n_classes = len(classes)
    444     pred_method = _get_prediction_method(estimator)
--> 445     predictions = _compute_predictions(pred_method, X[test], n_classes)
    446 
    447     sw = None if sample_weight is None else sample_weight[test]

~/anaconda3/lib/python3.8/site-packages/sklearn/calibration.py in _compute_predictions(pred_method, X, n_classes)
    499         (X.shape[0], 1).
    500     """
--> 501     predictions = pred_method(X=X)
    502     if hasattr(pred_method, '__name__'):
    503         method_name = pred_method.__name__

TypeError: predict_proba() got an unexpected keyword argument 'X'

Output:

CalibratedClassifierCV(base_estimator=LGBMClassifier(n_estimators=1))

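Is there something wrong with my XGBoost installation? I installed it with conda, and I remember uninstalling xgboost yesterday and installing it again.

My xgboost version: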

1.3.0

It is fixed now. It looks like there is a bug in scikit-learn 0.24.

I downgraded to 0.22.2.post1 and the error went away.

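I think the problem comes from XGBoost. The explanation is as follows.

XGBoost defines predict_proba(self, data, ...) rather than predict_proba(self, X, ...). Since sklearn 0.24 calls clf.predict_proba(X=X), passing X by keyword, an exception is raised.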
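The mismatch can be reproduced without installing either library. The sketch below uses a hypothetical stand-in class, LegacyClassifier, whose predict_proba names its first argument data, as the answer says XGBoost 1.3.0 does. The helper accepts_keyword is also an illustrative name and is not part of scikit-learn or XGBoost.

```python
import inspect


class LegacyClassifier:
    """Hypothetical stand-in that mimics the signature described above:
    the first argument is named `data`, not `X`."""

    def predict_proba(self, data, ntree_limit=None):
        # Return a fixed two-class probability row per sample.
        return [[0.5, 0.5] for _ in data]


def accepts_keyword(method, name):
    """Return True if `method` can be called with `name=` as a keyword."""
    params = inspect.signature(method).parameters
    if name in params:
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs catch-all accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


clf = LegacyClassifier()
samples = [[1], [2], [3]]

# A positional call works regardless of the parameter name.
positional_result = clf.predict_proba(samples)

# The keyword call, the style the answer attributes to sklearn 0.24, fails.
try:
    clf.predict_proba(X=samples)
    keyword_error = None
except TypeError as exc:
    keyword_error = str(exc)

print(accepts_keyword(clf.predict_proba, "X"))     # False
print(accepts_keyword(clf.predict_proba, "data"))  # True
print(keyword_error)
```

Running the same accepts_keyword check against the predict_proba of an installed estimator would be one way to tell whether a given version is affected before calling CalibratedClassifierCV.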

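Here is a workaround that does not require changing package versions: create a class that inherits from XGBClassifier, override predict_proba with the correct argument name (X), and call super().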
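A minimal sketch of that override pattern follows. To keep it runnable without xgboost, it subclasses a hypothetical stand-in, LegacyClassifier, instead of the real estimator. The names LegacyClassifier and CustomClassifier are illustrative only.

```python
class LegacyClassifier:
    """Hypothetical stand-in for an estimator whose predict_proba names its
    first argument `data` instead of `X`."""

    def predict_proba(self, data, ntree_limit=None):
        # Return a fixed two-class probability row per sample.
        return [[0.5, 0.5] for _ in data]


class CustomClassifier(LegacyClassifier):
    """Subclass that only renames the first argument to `X`."""

    def predict_proba(self, X, **kwargs):
        # Forward X positionally so the parent's parameter name is irrelevant;
        # any other options are passed through unchanged.
        return super().predict_proba(X, **kwargs)


clf = CustomClassifier()
samples = [[1], [2], [3]]

by_keyword = clf.predict_proba(X=samples)  # keyword call now succeeds
by_position = clf.predict_proba(samples)   # positional call still works
with_option = clf.predict_proba(X=samples, ntree_limit=1)
```

With xgboost installed, the same method body would presumably go on a subclass of XGBClassifier. That substitution is an assumption here and was not tested against a specific xgboost release.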

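Please accept your own answer so that it is visibly helpful to others in the future.

If you have identified the bug, is there an issue for it on scikit-learn's GitHub?

Yes, a bug report was opened following the PR thread.

Great, it looks like the fix is in xgboost version >= 1.3.2 (with any sklearn version).

I get exactly the same error with catboost (0.24.4): TypeError: predict_proba() got an unexpected keyword argument 'X'. Do you know how to fix it?

@6761646f6e Do you have a template for the new class that inherits from XGBClassifier?

Hi, my new class that inherits from XGBClassifier contains: def predict_proba(self, X, ...): return super(CustomXGBClassifier, self).predict_proba(X, ...). I also noticed a problem with the kwargs attribute: if you need to sklearn.base.clone the model, you need to override get_params.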
