Python: strange error when trying to run xgboost.predict or xgboost.score
I am trying to run an xgboost regressor model on a dataset that has no missing data.
# Run GBM on training dataset
# Create xgboost object
pts_xgb = xgb.XGBRegressor(objective="reg:squarederror", missing=None, seed=42)
# Fit xgboost onto data
pts_xgb.fit(X_train
,y_train
,verbose=True
,early_stopping_rounds=10
,eval_metric='rmse'
,eval_set=[(X_test,y_test)])
Model creation seems to work fine, and I confirmed that X_train and y_train contain no null values with the following:
print(X_train.isnull().values.sum()) # prints 0
print(y_train.isnull().values.sum()) # prints 0
But when I run the code below, I get the error below.
Code:
pts_xgb.score(X_train_test,y_train_test)
Error:
---------------------------------------------------------------------------
XGBoostError Traceback (most recent call last)
<ipython-input-37-39b223d418b2> in <module>
----> 1 pts_xgb.score(X_train_test,y_train_test)
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/sklearn/base.py in score(self, X, y, sample_weight)
551
552 from .metrics import r2_score
--> 553 y_pred = self.predict(X)
554 return r2_score(y, y_pred, sample_weight=sample_weight)
555
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/xgboost/sklearn.py in predict(self, X, output_margin, ntree_limit, validate_features, base_margin, iteration_range)
818 if self._can_use_inplace_predict():
819 try:
--> 820 predts = self.get_booster().inplace_predict(
821 data=X,
822 iteration_range=iteration_range,
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/xgboost/core.py in inplace_predict(self, data, iteration_range, predict_type, missing, validate_features, base_margin, strict_shape)
1844 from .data import _maybe_np_slice
1845 data = _maybe_np_slice(data, data.dtype)
-> 1846 _check_call(
1847 _LIB.XGBoosterPredictFromDense(
1848 self.handle,
/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/xgboost/core.py in _check_call(ret)
208 """
209 if ret != 0:
--> 210 raise XGBoostError(py_str(_LIB.XGBGetLastError()))
211
212
XGBoostError: [09:18:58] /Users/travis/build/dmlc/xgboost/src/c_api/c_api_utils.h:157: Invalid missing value: null
Stack trace:
[bt] (0) 1 libxgboost.dylib 0x000000011e4e7064 dmlc::LogMessageFatal::~LogMessageFatal() + 116
[bt] (1) 2 libxgboost.dylib 0x000000011e4d9afc xgboost::GetMissing(xgboost::Json const&) + 268
[bt] (2) 3 libxgboost.dylib 0x000000011e4e0a13 void InplacePredictImpl<xgboost::data::ArrayAdapter>(std::__1::shared_ptr<xgboost::data::ArrayAdapter>, std::__1::shared_ptr<xgboost::DMatrix>, char const*, xgboost::Learner*, unsigned long, unsigned long, unsigned long long const**, unsigned long long*, float const**) + 531
[bt] (3) 4 libxgboost.dylib 0x000000011e4e04d3 XGBoosterPredictFromDense + 339
[bt] (4) 5 libffi.dylib 0x00007fff2dc7f8e5 ffi_call_unix64 + 85
I get the same error if I try to run pts_xgb.predict(X_train).
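Edit: This is not an issue of any missing/null values in X_train or y_train. I get the same error with the following dataset, which is much smaller than my actual dataset (see below):
X_train:
y_train: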
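Does anyone know why this happens? I could not find any other forum thread discussing the same problem.

Answer:
This is a missing/null value problem. Instead of
xgb.XGBRegressor(objective="reg:squarederror", missing=None, seed=42)
try
xgb.XGBRegressor(objective="reg:squarederror", missing=1, seed=42)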
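So, see the answers to the following:

Comments:
Instead of sum, try using count? If that does not show any nulls either, try using NVL or coalesce to replace nulls with a string, and count the instances of that string.
I tried a few different approaches, but every result shows 0 null/blank fields. I even exported to Excel (using X_train.to_excel(…)) because that makes me more comfortable, and I confirmed that there are no blank cells and that every cell is a number.
How did you confirm that in Excel?
I ran COUNT() (which counts only numeric values) and COUNTBLANK() on every column to confirm there are no blank cells. COUNT() returned the exact number of data rows for each column, and COUNTBLANK() returned 0 for each column.
OK, one thing you can try: apply a filter, open the drop-down, and look at the filter values, especially those at the end of the list. Nulls may have been converted to ? or N/A or something else. Try this for a few sample columns.
Ah, OK, thank you very much for your help. Why do you suggest putting "missing=1", though? Wouldn't it be best to leave the missing parameter out of XGBRegressor in the first place, since I don't actually have any missing data?
You can definitely do that. If you look at the documentation at the following link: "missing (float, default np.nan) – Value in the data which needs to be present as a missing value." I think the problem is that when you specify None, it is treated as an invalid value. The missing value can be a float or NaN (if you leave the parameter out)... it cannot be a string.
Yes, if you do not have any missing values, it is best to remove the parameter, since that implicitly indicates that the dataset has no missing values, so no rows are ignored when scoring or predicting.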
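The placeholder concern raised in the comments (nulls turned into "?" or "N/A") can also be checked programmatically rather than in Excel. The sketch below assumes the features live in a pandas DataFrame; the helper name summarize_missing and the PLACEHOLDERS set are hypothetical and only for illustration. It counts NaN values, infinite values in numeric columns, and placeholder strings in non-numeric columns.

```python
import numpy as np
import pandas as pd

# Hypothetical set of strings that sometimes stand in for missing data.
PLACEHOLDERS = {"?", "N/A", "NA", ""}

def summarize_missing(df: pd.DataFrame) -> dict:
    """Count NaN, infinite, and placeholder-string entries in a DataFrame."""
    numeric = df.select_dtypes(include=[np.number])
    non_numeric = [c for c in df.columns if c not in numeric.columns]
    # For each non-numeric column, count values that match a placeholder.
    placeholder_hits = {
        c: int(df[c].astype(str).str.strip().isin(PLACEHOLDERS).sum())
        for c in non_numeric
    }
    return {
        "nan": int(df.isnull().values.sum()),
        "inf": int(np.isinf(numeric.to_numpy(dtype=float)).sum()),
        "non_numeric_columns": non_numeric,
        "placeholders": placeholder_hits,
    }

# Small illustrative frame: one infinite value and one "N/A" string.
df = pd.DataFrame({
    "a": [1.0, np.inf, 3.0],
    "b": [1, 2, 3],
    "c": ["4", "N/A", "6"],
})
report = summarize_missing(df)
print(report)
```

A frame that reports zeros everywhere and no non-numeric columns, as the asker's did, points away from the data and toward the estimator's arguments.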