Python 尝试从XgBoost显示形状值时predict()出现问题?
我已经运行了一个XgBoost模型,我想显示预测的形状值。当我为SHAP创建变量时,我遇到了一个错误。以下是我已经安装的XgBoost型号的代码:Python 尝试从XgBoost显示形状值时predict()出现问题?,python,xgboost,Python,Xgboost,我已经运行了一个XgBoost模型,我想显示预测的形状值。当我为SHAP创建变量时,我遇到了一个错误。以下是我已经安装的XgBoost型号的代码: reg = xgb.XGBRegressor(n_estimators=1000) reg.fit(train_X, train_y, eval_set=[(train_X, train_y), (test_X, test_y)], early_stopping_rounds=50, verbose=False) df_com
reg = xgb.XGBRegressor(n_estimators=1000)
reg.fit(train_X, train_y,
eval_set=[(train_X, train_y), (test_X, test_y)],
early_stopping_rounds=50,
verbose=False)
df_compare = pd.DataFrame(data=test_y, columns = ["actual"])
df_compare['predicted'] = reg.predict(test_X)
# Model Evaluation
sqrt(mean_squared_error(y_true=df_compare['actual'],
y_pred=df_compare['predicted']))
# load JS visualization code to notebook
shap.initjs()
explainer = shap.TreeExplainer(reg)
shap_values = explainer.shap_values(test_X)
# summarize the effects of all the features
shap.summary_plot(shap_values, test_X)
错误将在shap_values变量上突出显示,错误代码为:
TypeError: predict() got an unexpected keyword argument 'validate_features'
我的目标是在测试集上显示个体预测因子的贡献。这可以通过基于slundberg的GitHub存储库的“shap.summary_plot()”命令来完成
从我最初的研究来看,这似乎是XgBoost的一个常见问题,我想知道是否有人能解决这个问题
任何帮助都会很好
编辑:以下是当前模式中的test_X示例:
array([[6.13181152e-01, 1.65250069e-01, 6.28375079e-01, 1.65250069e-01,
7.69355058e-01, 1.65250069e-01, 4.00000000e+00, 1.20000000e+01,
2.01300000e+03],
[6.25013774e-01, 1.50569938e-01, 6.40500901e-01, 1.50569938e-01,
7.84201386e-01, 1.50569938e-01, 1.00000000e+00, 1.00000000e+00,
2.01400000e+03],
[6.35163552e-01, 1.33475880e-01, 6.50902178e-01, 1.33475880e-01,
7.96936256e-01, 1.33475880e-01, 1.00000000e+00, 2.00000000e+00,
2.01400000e+03],
[6.46226644e-01, 1.09757193e-01, 6.62239401e-01, 1.09757193e-01,
8.10817057e-01, 1.09757193e-01, 1.00000000e+00, 3.00000000e+00,
2.01400000e+03],
[6.59526768e-01, 8.31406390e-02, 6.75869086e-01, 8.31406390e-02,
8.27504651e-01, 8.31406390e-02, 2.00000000e+00, 4.00000000e+00,
2.01400000e+03],
[6.75320666e-01, 6.19388504e-02, 6.92054339e-01, 6.19388504e-02,
8.47321169e-01, 6.19388504e-02, 2.00000000e+00, 5.00000000e+00,
2.01400000e+03],
[6.93341542e-01, 5.11984019e-02, 7.10521752e-01, 5.11984019e-02,
8.69931864e-01, 5.11984019e-02, 2.00000000e+00, 6.00000000e+00,
2.01400000e+03],
[7.10885315e-01, 4.83581090e-02, 7.28500240e-01, 4.83581090e-02,
8.91943941e-01, 4.83581090e-02, 3.00000000e+00, 7.00000000e+00,
2.01400000e+03],
[7.24623815e-01, 4.81424976e-02, 7.42579164e-01, 4.81424976e-02,
9.09181562e-01, 4.81424976e-02, 3.00000000e+00, 8.00000000e+00,
2.01400000e+03],
[7.32223979e-01, 4.68193402e-02, 7.50367651e-01, 4.68193402e-02,
9.18717446e-01, 4.68193402e-02, 3.00000000e+00, 9.00000000e+00,
2.01400000e+03],
[7.36887811e-01, 4.51536143e-02, 7.55147047e-01, 4.51536143e-02,
9.24569131e-01, 4.51536143e-02, 4.00000000e+00, 1.00000000e+01,
2.01400000e+03],
[7.43107813e-01, 4.53410592e-02, 7.61521174e-01, 4.53410592e-02,
9.32373334e-01, 4.53410592e-02, 4.00000000e+00, 1.10000000e+01,
2.01400000e+03],
[7.53861886e-01, 4.90621338e-02, 7.72541721e-01, 4.90621338e-02,
9.45866411e-01, 4.90621338e-02, 4.00000000e+00, 1.20000000e+01,
2.01400000e+03],
[7.67586715e-01, 5.63629131e-02, 7.86606635e-01, 5.63629131e-02,
9.63086879e-01, 5.63629131e-02, 1.00000000e+00, 1.00000000e+00,
2.01500000e+03],
[7.80160005e-01, 6.59919566e-02, 7.99491477e-01, 6.59919566e-02,
9.78862518e-01, 6.59919566e-02, 1.00000000e+00, 2.00000000e+00,
2.01500000e+03],
[7.89674219e-01, 7.78638363e-02, 8.09241442e-01, 7.78638363e-02,
9.90799950e-01, 7.78638363e-02, 1.00000000e+00, 3.00000000e+00,
2.01500000e+03],
[7.95533832e-01, 9.25097947e-02, 8.15246251e-01, 9.25097947e-02,
9.98151976e-01, 9.25097947e-02, 2.00000000e+00, 4.00000000e+00,
2.01500000e+03],
[7.97006720e-01, 1.09847565e-01, 8.16755635e-01, 1.09847565e-01,
1.00000000e+00, 1.09847565e-01, 2.00000000e+00, 5.00000000e+00,
2.01500000e+03],
[7.94528301e-01, 1.28832231e-01, 8.14215803e-01, 1.28832231e-01,
9.96890340e-01, 1.28832231e-01, 2.00000000e+00, 6.00000000e+00,
2.01500000e+03]])
这对我很有用:
!pip install shap==0.19.2
能否添加一个数据模型来制作一个完全可复制的示例?另外,详细说明shap.summary\u plot
中的X
是什么,以及为什么它与test\u X
不同。如果我添加随机数据,您的代码片段在xgboost 0.71
和shap 0.19.2
@mykhailolisoy下运行良好我很抱歉响应太晚。所有数字的格式与我上面发布的示例完全相同。希望这有帮助!它与np.random.seed(312)一起工作吗;列_X=np.random.random((10000,10));序列y=np.random.randn((10000));测试_X=np.随机.随机((1000,10));test_y=np.random.randn((1000))
如果没有,请检查软件包版本。这对我来说也很有效,我将你的示例数据拆分为15:4=train:testsamples我实际上在使用Google Colab来完成所有这些。我在代码的开头运行了“!pip install shap”。我的shap版本是:shap-0.28.3。我的XgBoost版本是:0.7.post4。我也运行了您上一个答案中的最后两个单元格的代码,或者出于某种原因,shap没有出现,但是xgboost与您的输出相同。Do!pip install shap==0.19.2
,正如我现在在链接的示例顶部添加的,请详细说明这是如何解决问题的。