Python sklearn: getting a prediction score from a random forest?


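I am building a model with sklearn's random forest classifier. When I use it to make predictions, is there a way to get the certainty level of a prediction (i.e., the number of trees that predicted that class)?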

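There is no direct way to get the per-tree vote counts. You have to take each tree out of the forest, make a (single-tree) prediction, and then count how many trees gave the same answer as the forest.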

Here is an example:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

# training data
X = np.array([[1, 2, 3, 4], [1, 3, 1, 2], [4, 6, 1, 2], [3, 3, 4, 3], [1, 1, 2, 1]])
# target variable
y = np.array([1, 0, 1, 1, 0])

# build and fit the random forest model
forest = RandomForestClassifier(n_estimators=10, random_state=1)
forest = forest.fit(X, y)

# predictions of the whole forest
full_predictions = forest.predict(X)
print(full_predictions)
# [1 0 1 1 0]

# one counter per row of the data: how many trees agree with the forest's prediction
counts_of_same_predictions = [0 for i in range(len(y))]
# access each tree, make a prediction, and compare it with the forest's prediction
for tree_in_forest in forest.estimators_:
    # sub-trees return indices into forest.classes_ (as floats), so map them back to labels
    single_tree_predictions = forest.classes_[tree_in_forest.predict(X).astype(int)]
    for j in range(len(single_tree_predictions)):
        if single_tree_predictions[j] == full_predictions[j]:
            # increment the count for that row
            counts_of_same_predictions[j] += 1

print('counts of same classifications', counts_of_same_predictions)
# counts of same classifications [6, 7, 8, 8, 8]
# (exact counts may differ between scikit-learn versions)
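The same counting can be written more compactly with NumPy. The following is only a sketch: `tree_vote_fractions` is a hypothetical helper name, not part of scikit-learn, and it assumes a single-output classifier. It returns, for every row, the share of trees that voted for each class.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def tree_vote_fractions(forest, X):
    """Hypothetical helper: share of trees voting for each class, per sample."""
    # Each sub-tree predicts an index into forest.classes_ (returned as a float).
    votes = np.stack([tree.predict(X).astype(int) for tree in forest.estimators_])
    n_classes = len(forest.classes_)
    # For every class index k, count how many trees voted k for each sample.
    counts = np.stack([(votes == k).sum(axis=0) for k in range(n_classes)], axis=1)
    return counts / len(forest.estimators_)


X = np.array([[1, 2, 3, 4], [1, 3, 1, 2], [4, 6, 1, 2], [3, 3, 4, 3], [1, 1, 2, 1]])
y = np.array([1, 0, 1, 1, 0])
forest = RandomForestClassifier(n_estimators=10, random_state=1).fit(X, y)

fractions = tree_vote_fractions(forest, X)
print(forest.classes_)  # column order of the result
print(fractions)        # one row per sample, one column per class
```

Multiplying a row by `len(forest.estimators_)` gives back the raw tree counts for that sample.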

Apparently, RandomForestClassifier has a built-in method for this:

forest.predict_proba(X)
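As a minimal sketch of how this might be used, reusing the toy data from the earlier answer: `predict_proba` returns one row per sample and one column per class, with columns in the order of `forest.classes_`. Note that scikit-learn computes these values by averaging the per-tree class probabilities rather than by counting hard votes, so they can differ from the fraction of agreeing trees when leaves are not pure (for example, when `max_depth` is limited).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.array([[1, 2, 3, 4], [1, 3, 1, 2], [4, 6, 1, 2], [3, 3, 4, 3], [1, 1, 2, 1]])
y = np.array([1, 0, 1, 1, 0])
forest = RandomForestClassifier(n_estimators=10, random_state=1).fit(X, y)

# Shape (n_samples, n_classes); columns follow forest.classes_.
proba = forest.predict_proba(X)

# The most probable class for each row and its probability.
predicted = forest.classes_[np.argmax(proba, axis=1)]
confidence = proba.max(axis=1)

for label, p in zip(predicted, confidence):
    print(f"predicted class {label} with probability {p:.2f}")
```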

For full details on predict_proba, see the scikit-learn documentation.