Python scikit-learn random forest predict_proba outputs rounded values




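I am using a random forest in scikit-learn for classification, and to get the class probabilities I call predict_proba. However, it outputs probabilities rounded to one decimal place.

I tried this with the sample iris dataset: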

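import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
# Randomly mark roughly 75% of the rows as training data
df['is_train'] = np.random.uniform(0, 1, len(df)) <= .75
# iris.target holds integer codes, so map them to names with from_codes
df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)
df.head()

train, test = df[df['is_train']], df[~df['is_train']]

features = df.columns[:4]
clf = RandomForestClassifier(n_jobs=2)
y, _ = pd.factorize(train['species'])
clf.fit(train[features], y)
clf.predict_proba(train[features])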

Is this the default output? Is it possible to get more decimal places?

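Note: I found a solution.

The default number of trees was 10. After increasing the number of trees to 100, the precision of the probabilities increased.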

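The default is ten trees (in scikit-learn releases before 0.22; later releases default to 100), and your code is evidently using the default: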

Parameters: 
n_estimators : integer, optional (default=10)
The number of trees in the forest.

Try the following, which raises the number of trees to 25, or use any other value greater than 10:

RandomForestClassifier(n_estimators=25, n_jobs=2)

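If you are simply getting the fraction of votes from the 10 default trees, that would very likely produce the probabilities you are seeing.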
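The vote-fraction explanation can be checked directly. The following is a minimal sketch, assuming fully grown trees (the default settings) trained on the whole iris dataset; the helper name distinct_probabilities and the fixed random_state are illustrative choices, not part of the original question. With n trees that each predict a class with probability 0 or 1, every averaged probability should be a multiple of 1/n.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)


def distinct_probabilities(n_trees):
    """Fit a forest of n_trees trees and return the sorted distinct
    probability values it predicts on the training data.

    Hypothetical helper, written only for this illustration.
    """
    clf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    clf.fit(X, y)
    # Round to strip floating-point noise before collecting distinct values.
    return np.unique(np.round(clf.predict_proba(X), 6))


few = distinct_probabilities(10)    # values on a grid of step 1/10
many = distinct_probabilities(100)  # values on a grid of step 1/100

print("10 trees: ", few)
print("100 trees:", len(many), "distinct values")
```

With 10 trees the output can only take values such as 0.0, 0.1, 0.2 and so on, which looks like rounding to one decimal place even though no rounding takes place.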

You may be running into this because the iris dataset is very small. If I remember correctly, it has fewer than 200 observations.
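The size of the dataset is easy to confirm. A short sketch, assuming the copy of iris bundled with scikit-learn:

```python
from sklearn.datasets import load_iris

iris = load_iris()
# iris.data is a 2-D array: one row per observation, one column per feature.
n_samples, n_features = iris.data.shape
print(n_samples, "observations,", n_features, "features")
```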

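I could not find any parameter in the documentation for adjusting the decimal precision of the predicted probabilities.

The documentation for predict_proba() reads as follows: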


The predicted class probabilities of an input sample is computed as the
mean predicted class probabilities of the trees in the forest. The class
probability of a single tree is the fraction of samples of the same 
class in a leaf.
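
The averaging described in the documentation can be reproduced by hand. This is a minimal sketch, assuming the forest is fitted on the full iris dataset with a NumPy array as input; the variable names per_tree and manual are illustrative. It averages the per-tree probabilities taken from the fitted estimators_ attribute and compares the result with what the forest itself returns.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=25, random_state=0).fit(X, y)

# Collect each tree's class probabilities: shape (n_trees, n_samples, n_classes).
per_tree = np.stack([tree.predict_proba(X) for tree in clf.estimators_])

# The forest's probability is the mean over the tree axis.
manual = per_tree.mean(axis=0)

print(np.allclose(manual, clf.predict_proba(X)))
```

Because the result is a plain mean over trees, the granularity of the output is set by n_estimators and not by any precision setting.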