Python: high classification metric results (Python, Machine Learning, Random Forest, Multiclass Classification)

python machine-learning

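I am trying to identify crop types using machine learning. This is a pixel-level classification. I have 16 classes (targets), and these are the shapes of my training and test datasets: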

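from sklearn.model_selection import train_test_split
# Features: per-pixel feature matrix, Labels: crop-type label per pixel
X_train, X_test, Y_train, Y_test = train_test_split(Features, Labels, test_size=0.25)
X_train.shape, X_test.shape, Y_train.shape, Y_test.shape
# ((48330, 420), (16110, 420), (48330,), (16110,))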

I wanted to start by experimenting with a baseline model, so I did the following:

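from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Baseline: random forest with default hyperparameters
classifier = RandomForestClassifier()
classifier.fit(X_train, Y_train)
y_pred = classifier.predict(X_test)

print(confusion_matrix(Y_test, y_pred))
print(classification_report(Y_test, y_pred))
print(accuracy_score(Y_test, y_pred))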

This is the final result:

I don't understand what is happening here. Why are my metrics so high?

PS: My dataset is highly imbalanced.

Take a look at your training and test data. Most likely the data is not arranged the way you intended.

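Why not start with fewer trees in the classifier and set the maximum depth to 2 or 3? That should be a good starting point. If it still behaves the same way, simplify the model further.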
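The suggestion above can be sketched roughly as follows. This is only an illustration: the data is a synthetic stand-in generated with `make_classification` (not the crop dataset from the question), and the helper `fit_and_score` and the specific values `n_estimators=10`, `max_depth=3` are assumptions chosen for the example. Comparing train and test accuracy for a shallow forest against the default one is one way to see how much of the score comes from model capacity.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real Features / Labels arrays (an assumption).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=10, n_classes=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)


def fit_and_score(n_estimators, max_depth):
    """Fit a random forest and return (train accuracy, test accuracy)."""
    clf = RandomForestClassifier(
        n_estimators=n_estimators, max_depth=max_depth, random_state=0
    )
    clf.fit(X_train, y_train)
    train_acc = accuracy_score(y_train, clf.predict(X_train))
    test_acc = accuracy_score(y_test, clf.predict(X_test))
    return train_acc, test_acc


# Deliberately small model: few trees, shallow depth.
shallow_train_acc, shallow_test_acc = fit_and_score(n_estimators=10, max_depth=3)
# Default-sized forest with fully grown trees, for comparison.
default_train_acc, default_test_acc = fit_and_score(n_estimators=100, max_depth=None)

print("shallow:", shallow_train_acc, shallow_test_acc)
print("default:", default_train_acc, default_test_acc)
```

If a heavily restricted forest still scores almost perfectly on the test set of the real data, the cause is more likely in the data or the split than in the model.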

Your dataset is imbalanced. Try to fix that first, then move on to hyperparameter tuning.
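One possible way to account for the imbalance, sketched under assumptions: the data below is synthetic (three classes with made-up proportions, not the 16 crop classes), and the choices of a stratified split, `class_weight="balanced"`, and macro-averaged metrics are illustrative options rather than the answerer's stated method. Plain accuracy can look high on imbalanced data simply because the majority class dominates, so balanced accuracy and macro F1 are reported next to it.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data; the class proportions are made up for illustration.
X, y = make_classification(
    n_samples=3000, n_features=20, n_informative=8, n_classes=3,
    weights=[0.90, 0.07, 0.03], random_state=0,
)

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# class_weight="balanced" reweights classes inversely to their frequency.
clf = RandomForestClassifier(class_weight="balanced", random_state=0)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

acc = accuracy_score(y_test, y_pred)
bal_acc = balanced_accuracy_score(y_test, y_pred)  # mean of per-class recall
macro_f1 = f1_score(y_test, y_pred, average="macro")  # every class counts equally

print("accuracy:", acc)
print("balanced accuracy:", bal_acc)
print("macro F1:", macro_f1)
```

A large gap between accuracy and balanced accuracy would suggest that the minority classes are being predicted poorly.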

Could you elaborate a bit, please? Are you sure there is no overlap between the training and test datasets?
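A quick check for overlap might look like the sketch below. The helper `count_overlapping_rows` is hypothetical (it is not part of scikit-learn or NumPy), the small arrays are made-up demo data, and the check assumes both arrays share the same dtype and only detects rows that match exactly.

```python
import numpy as np


def count_overlapping_rows(X_train, X_test):
    """Return how many rows of X_test also appear exactly in X_train."""
    # Hash each training row by its raw bytes for fast membership tests.
    train_rows = {row.tobytes() for row in np.ascontiguousarray(X_train)}
    return sum(row.tobytes() in train_rows for row in np.ascontiguousarray(X_test))


# Tiny demo arrays: the first test row duplicates the second training row.
X_train_demo = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
X_test_demo = np.array([[3.0, 4.0], [7.0, 8.0]])

n_overlap = count_overlapping_rows(X_train_demo, X_test_demo)
print("test rows also present in training data:", n_overlap)
```

A count well above zero on the real `X_train` and `X_test` would point to duplicated samples leaking across the split. A count of zero does not rule out near-duplicates, which an exact-match check cannot see.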