How to calculate SHAP values for an AdaBoost model?


I am running 3 different models (random forest, gradient boosting, AdaBoost) and a model ensemble based on those 3 models.

I managed to use SHAP for the GB and RF models, but not for AdaBoost, where I get the following error:

Exception                                 Traceback (most recent call last)
in engine
----> 1 explainer = shap.TreeExplainer(model,data = explain_data.head(1000), model_output= 'probability')

/home/cdsw/.local/lib/python3.6/site-packages/shap/explainers/tree.py in __init__(self, model, data, model_output, feature_perturbation, **deprecated_options)
    110         self.feature_perturbation = feature_perturbation
    111         self.expected_value = None
--> 112         self.model = TreeEnsemble(model, self.data, self.data_missing)
    113 
    114         if feature_perturbation not in feature_perturbation_codes:

/home/cdsw/.local/lib/python3.6/site-packages/shap/explainers/tree.py in __init__(self, model, data, data_missing)
    752             self.tree_output = "probability"
    753         else:
--> 754             raise Exception("Model type not yet supported by TreeExplainer: " + str(type(model)))
    755 
    756         # build a dense numpy version of all the tree objects

Exception: Model type not yet supported by TreeExplainer: <class 'sklearn.ensemble._weight_boosting.AdaBoostClassifier'>
I found this on the project's GitHub, which states:

TreeExplainer creates a TreeEnsemble object from whatever model type we are trying to explain, and then works with that downstream. So all that would need to be done is to write a TreeEnsemble constructor similar to the one for gradient boosting.


But I really don't know how to implement it, since I'm quite new to this.

I had the same problem. What I did was modify the file mentioned in the comment you quoted.

In my case I use Windows, so the file is located at C:\Users\my_user\AppData\Local\Continuum\anaconda3\Lib\site-packages\shap\explainers, but you can also double-click the error message and the file will open.

The next step is to add another elif, as the answer on GitHub suggests. In my case I did it starting at line 404, as follows:

1. Modify the source code. Note that for the other models, shap's code expects the model to have a criterion attribute directly, which AdaBoostClassifier does not. So in this case the attribute is taken from the weak classifiers that AdaBoost has trained, which is why I added model.base_estimator_.criterion.
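To see why this indirection is needed, note that the fitted boosting wrapper itself has no criterion, while each trained weak learner does. A small check (using estimators_[0] here, since base_estimator_ was renamed estimator_ in recent scikit-learn versions):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import AdaBoostClassifier

X, y = load_iris(return_X_y=True)
model = AdaBoostClassifier(n_estimators=10).fit(X, y)

# The boosting wrapper itself has no `criterion` attribute...
print(hasattr(model, "criterion"))     # False
# ...but every fitted weak learner (a decision tree) does
print(model.estimators_[0].criterion)  # 'gini' by default
```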

Finally, you have to import the library again, train your model, and get the SHAP values. Here is an example:

2. Import the library again and try it. This produces the following:

3. Get your new results. The shap package appears to have been updated and still does not include AdaBoostClassifier. Based on the previous answer, I modified it to match the current shap/explainers/tree.py, at lines 598-610:

### Added AdaBoostClassifier based on the outdated StackOverflow response and Github issue here
### https://stackoverflow.com/questions/60433389/how-to-calculate-shap-values-for-adaboost-model/61108156#61108156
### https://github.com/slundberg/shap/issues/335
elif safe_isinstance(model, ["sklearn.ensemble.AdaBoostClassifier", "sklearn.ensemble._weight_boosting.AdaBoostClassifier"]):
    assert hasattr(model, "estimators_"), "Model has no `estimators_`! Have you called `model.fit`?"
    self.internal_dtype = model.estimators_[0].tree_.value.dtype.type
    self.input_dtype = np.float32
    scaling = 1.0 / len(model.estimators_) # output is average of trees
    self.trees = [Tree(e.tree_, normalize=True, scaling=scaling) for e in model.estimators_]
    self.objective = objective_name_map.get(model.base_estimator_.criterion, None) #This line is done to get the decision criteria, for example gini.
    self.tree_output = "probability" #This is the last line added

I am still testing getting this added to the package:

from sklearn import datasets
from sklearn.ensemble import AdaBoostClassifier
import shap

# import some data to play with
iris = datasets.load_iris()
X = iris.data
y = iris.target

ADABoost_model = AdaBoostClassifier()
ADABoost_model.fit(X, y)

shap_values = shap.TreeExplainer(ADABoost_model).shap_values(X)
shap.summary_plot(shap_values, X, plot_type="bar")