Python 3.x xgboost error: Check failed: !auc_error AUC: the dataset only contains pos or neg samples
I am running the following code without any problems:
import xgboost as xgb

# class_data is the asker's pandas DataFrame (not shown); Churn is its label column
churn_dmatrix = xgb.DMatrix(data=class_data.iloc[:, :-1], label=class_data.Churn)
params = {'objective': 'binary:logistic', 'max_depth': 4}
cv_results = xgb.cv(dtrain=churn_dmatrix, params=params, nfold=4, num_boost_round=1,
                    metrics='error', as_pandas=True)
print(cv_results)
   train-error-mean  train-error-std  test-error-mean  test-error-std
0          0.395833         0.108253            0.375        0.414578
However, when I change the metric to 'auc', I get an error message:
cv_results = xgb.cv(dtrain=churn_dmatrix, params=params, nfold=4, num_boost_round=5,
                    metrics='auc', as_pandas=True)
---------------------------------------------------------------------------
XGBoostError Traceback (most recent call last)
<ipython-input-102-ea99ef0705b5> in <module>()
----> 1 cv_results = xgb.cv(dtrain = churn_dmatrix, params = params, nfold = 4, num_boost_round = 5, metrics = 'auc', as_pandas = True)
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\training.py in cv(params, dtrain, num_boost_round, nfold, stratified, folds, metrics, obj, feval, maximize, early_stopping_rounds, fpreproc, as_pandas, verbose_eval, show_stdv, seed, callbacks, shuffle)
405 for fold in cvfolds:
406 fold.update(i, obj)
--> 407 res = aggcv([f.eval(i, feval) for f in cvfolds])
408
409 for key, mean, std in res:
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\training.py in <listcomp>(.0)
405 for fold in cvfolds:
406 fold.update(i, obj)
--> 407 res = aggcv([f.eval(i, feval) for f in cvfolds])
408
409 for key, mean, std in res:
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\training.py in eval(self, iteration, feval)
220 def eval(self, iteration, feval):
221 """"Evaluate the CVPack for one iteration."""
--> 222 return self.bst.eval_set(self.watchlist, iteration, feval)
223
224
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\core.py in eval_set(self, evals, iteration, feval)
953 dmats, evnames,
954 c_bst_ulong(len(evals)),
--> 955 ctypes.byref(msg)))
956 res = msg.value.decode()
957 if feval is not None:
C:\ProgramData\Anaconda3\lib\site-packages\xgboost\core.py in _check_call(ret)
128 """
129 if ret != 0:
--> 130 raise XGBoostError(_LIB.XGBGetLastError())
131
132
XGBoostError: b'[14:27:23] src/metric/rank_metric.cc:135: Check failed: !auc_error AUC: the dataset only contains pos or neg samples'
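It seems that all the predictions are either positive or negative. Am I right? Is there anything I can do?

The problem appears when xgboost tries to split the data into training and validation sets and one of the splits ends up with no negative or no positive examples (in either the training set or the validation set). I found two quick approaches you can take: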
99/1
a product of)
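To make the cause concrete, here is a minimal pure-Python sketch (no xgboost involved) of how a plain random 4-fold split of a heavily imbalanced label vector can leave a split with only one class, and how dealing each class out across the folds avoids it. The helper names `random_folds`, `stratified_folds` and `single_class_folds` are made up for this illustration, and the label vectors are toy data, not the asker's.

```python
import random


def random_folds(n_samples, nfold, seed=0):
    """Shuffle all indices and cut them into nfold parts, ignoring the labels."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::nfold] for k in range(nfold)]


def stratified_folds(labels, nfold, seed=0):
    """Deal the indices of each class round-robin over the folds.

    Every fold receives at least one sample of a class as long as that
    class has at least nfold samples.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(nfold)]
    position = 0
    for cls in sorted(set(labels)):
        members = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(members)
        for i in members:
            folds[position % nfold].append(i)
            position += 1
    return folds


def single_class_folds(labels, folds):
    """Return the folds whose held-out part or training part has fewer than two classes."""
    bad = []
    for k, test_idx in enumerate(folds):
        held_out = set(test_idx)
        test_classes = {labels[i] for i in held_out}
        train_classes = {y for i, y in enumerate(labels) if i not in held_out}
        if len(test_classes) < 2 or len(train_classes) < 2:
            bad.append(k)
    return bad


# Toy data: 23 negatives and a single positive. Whatever the shuffle, the lone
# positive lands in exactly one fold, so every split is missing a class either
# in its held-out part or in its training part.
rare = [0] * 23 + [1]
print(single_class_folds(rare, random_folds(len(rare), 4)))  # → [0, 1, 2, 3]

# Toy data: 20 negatives and 4 positives, dealt out per class.
labels = [0] * 20 + [1] * 4
print(single_class_folds(labels, stratified_folds(labels, 4)))  # → []
```

In xgboost itself, the `xgb.cv` signature shown in the traceback already lists `stratified`, `folds` and `seed` parameters, so passing `stratified=True` is probably the closest built-in equivalent. It can only help if each class has at least `nfold` samples; with fewer, some split will still contain a single class and AUC stays undefined for it.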