Getting the prediction probability error from a sklearn random forest classifier (scikit-learn, classification, random-forest)


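From the scikit-learn documentation:

The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the trees in the forest.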
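The averaging described in that sentence can be checked directly. The following is a minimal sketch, assuming a single-output classifier and synthetic data from make_classification (the dataset and the variable names are illustrative, not from the original question). It averages the output of each fitted tree in estimators_ and compares the result with the forest's own predict_proba.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

forest = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Average the class probabilities predicted by the individual trees.
tree_mean = np.mean(
    [tree.predict_proba(X) for tree in forest.estimators_], axis=0
)

# The forest's own output should agree with that average
# up to floating-point error.
matches = np.allclose(tree_mean, forest.predict_proba(X))
print(matches)
```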

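My question is: is there a way to extract the mean squared error of each predicted probability?

For example, one should be able to get the predicted probabilities from each individual tree, but I cannot find how to obtain them.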

Edit:

This is the sklearn code for the predict_proba function:

def predict_proba(self, X):
    """Predict class probabilities for X.
    The predicted class probabilities of an input sample are computed as
    the mean predicted class probabilities of the trees in the forest. The
    class probability of a single tree is the fraction of samples of the same
    class in a leaf.
    Parameters
    ----------
    X : array-like or sparse matrix of shape = [n_samples, n_features]
        The input samples. Internally, its dtype will be converted to
        ``dtype=np.float32``. If a sparse matrix is provided, it will be
        converted into a sparse ``csr_matrix``.
    Returns
    -------
    p : array of shape = [n_samples, n_classes], or a list of n_outputs
        such arrays if n_outputs > 1.
        The class probabilities of the input samples. The order of the
        classes corresponds to that in the attribute `classes_`.
    """
    # Check data
    X = self._validate_X_predict(X)

    # Assign chunk of trees to jobs
    n_jobs, _, _ = _partition_estimators(self.n_estimators, self.n_jobs)

    # Parallel loop
    all_proba = Parallel(n_jobs=n_jobs, verbose=self.verbose,
                         backend="threading")(
        delayed(parallel_helper)(e, 'predict_proba', X,
                                  check_input=False)
        for e in self.estimators_)

    # Reduce
    proba = all_proba[0]

    if self.n_outputs_ == 1:
        for j in range(1, len(all_proba)):
            proba += all_proba[j]

        proba /= len(self.estimators_)

    else:
        for j in range(1, len(all_proba)):
            for k in range(self.n_outputs_):
                proba[k] += all_proba[j][k]

        for k in range(self.n_outputs_):
            proba[k] /= self.n_estimators

    return proba

So it seems the per-tree probabilities are readily available in the all_proba array. What remains is how to implement this.
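The all_proba list is local to predict_proba, but the same per-tree values can be rebuilt from the public estimators_ attribute without touching the library source. A minimal sketch, assuming a single-output forest; per_tree_proba is a hypothetical helper name, not part of scikit-learn, and the iris data is used only for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier


def per_tree_proba(forest, X):
    """Return an array of shape (n_trees, n_samples, n_classes).

    Hypothetical helper, not part of scikit-learn.
    Assumes a fitted, single-output forest.
    """
    return np.stack([tree.predict_proba(X) for tree in forest.estimators_])


X, y = load_iris(return_X_y=True)
forest = RandomForestClassifier(n_estimators=30, random_state=0).fit(X, y)

# One probability matrix per tree, for the first five samples.
all_proba = per_tree_proba(forest, X[:5])
print(all_proba.shape)  # (30, 5, 3)
```

Stacking into a single array, rather than keeping a Python list, makes it easy to reduce over the tree axis afterwards.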

How do you define the mean squared error for a classification problem? Each tree in the forest computes a probability, and the average is taken as the output; simply compute the deviation from that average.
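One way to read "the deviation from that average" is the mean squared deviation of the per-tree probabilities around the forest mean, computed per sample and per class. The sketch below assumes that reading, a single-output forest, and synthetic data. Note that it measures how much the trees disagree with each other, not the error against the true labels. The standard-error line additionally treats the trees as independent, which is only an approximation.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=300, n_features=10, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

forest = RandomForestClassifier(n_estimators=50, random_state=1)
forest.fit(X_train, y_train)

# Per-tree probabilities, shape (n_trees, n_samples, n_classes).
tree_proba = np.stack([t.predict_proba(X_test) for t in forest.estimators_])

# Forest output: the mean over the tree axis.
mean_proba = tree_proba.mean(axis=0)

# Mean squared deviation of the trees from the forest mean,
# one value per (sample, class) pair.
msd = ((tree_proba - mean_proba) ** 2).mean(axis=0)

# Standard deviation across trees, and a rough standard error of the mean
# (assumption: trees treated as independent draws).
std = np.sqrt(msd)
sem = std / np.sqrt(len(forest.estimators_))

print(mean_proba[:3])
print(std[:3])
```

Samples where std is large are those on which the trees disagree most, so the averaged probability is least stable there.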