How do I compute a confusion matrix for multiclass classification in scikit-learn?


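I have a multiclass classification task. When I run my script as follows:

from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.multiclass import OneVsRestClassifier

classifier = OneVsRestClassifier(GradientBoostingClassifier(n_estimators=70, max_depth=3, learning_rate=.02))

y_pred = classifier.fit(X_train, y_train).predict(X_test)
cnf_matrix = confusion_matrix(y_test, y_pred)

I get this error: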

File "C:\ProgramData\Anaconda2\lib\site-packages\sklearn\metrics\classification.py", line 242, in confusion_matrix
    raise ValueError("%s is not supported" % y_type)
ValueError: multilabel-indicator is not supported

I tried passing labels=classifier.classes_ to confusion_matrix().

y_test and y_pred look like this:
y_test =
array([[0, 0, 0, 1, 0, 0],
   [0, 0, 0, 0, 1, 0],
   [0, 1, 0, 0, 0, 0],
   ..., 
   [0, 0, 0, 0, 0, 1],
   [0, 0, 0, 1, 0, 0],
   [0, 0, 0, 0, 1, 0]])


y_pred = 
array([[0, 0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0, 0],
   [0, 0, 0, 0, 0, 0],
   ..., 
   [0, 0, 0, 0, 0, 1],
   [0, 0, 0, 0, 0, 1],
   [0, 0, 0, 0, 0, 0]])

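First, you need to build the label output array.
Suppose you have 3 classes, 'cat', 'dog', 'house', indexed 0, 1, 2.
The predictions for 2 samples are: dog, house.
Your output will be:
y_pred = np.array([[0, 1, 0], [0, 0, 1]])

Run y_pred.argmax(1) to get: [1, 2]
This array holds the original label indices, meaning:
['dog', 'house']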
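import numpy as np
from keras.utils import np_utils  # assumed source of np_utils (older Keras); newer releases expose keras.utils.to_categorical

num_classes = 3

# from label indices to categorical (one-hot)
y_prediction = np.array([1, 2])
y_categorial = np_utils.to_categorical(y_prediction, num_classes)

# from categorical (one-hot) back to label indices
y_pred = y_categorial.argmax(1)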
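The same round trip can be sketched with NumPy alone, in case Keras is not installed. The `class_names` list and the variable names are illustrative assumptions taken from the cat/dog/house example, not part of any library.

```python
import numpy as np

# Illustrative class names from the example above; list position = class index
class_names = ["cat", "dog", "house"]

# One-hot predictions for two samples: dog, house
y_pred_onehot = np.array([[0, 1, 0],
                          [0, 0, 1]])

# argmax along axis 1 recovers the class index of each row
y_pred_idx = y_pred_onehot.argmax(axis=1)

# Map the indices back to the class names
y_pred_names = [class_names[i] for i in y_pred_idx]

# Going the other way: indexing an identity matrix rebuilds the one-hot rows
y_onehot_again = np.eye(len(class_names), dtype=int)[y_pred_idx]

print(y_pred_idx)    # [1 2]
print(y_pred_names)  # ['dog', 'house']
```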

This worked for me:
import numpy as np
from sklearn.metrics import confusion_matrix

y_test_non_category = [np.argmax(t) for t in y_test]
y_predict_non_category = [np.argmax(t) for t in y_predict]

conf_mat = confusion_matrix(y_test_non_category, y_predict_non_category)

where y_test and y_predict are categorical variables, i.e. one-hot vectors.
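A small self-contained run of this approach might look like the following. The one-hot arrays are invented for illustration (3 classes, 4 samples), and the `labels` argument is passed explicitly so the matrix keeps one row and column per class even if a class is missing from the data.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Made-up one-hot data with 3 classes (not the asker's data)
y_test = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
y_predict = np.array([[1, 0, 0],
                      [0, 0, 1],
                      [0, 0, 1],
                      [0, 1, 0]])

# Collapse each one-hot row to its class index
y_true_idx = np.argmax(y_test, axis=1)     # [0 1 2 1]
y_pred_idx = np.argmax(y_predict, axis=1)  # [0 2 2 1]

# Rows are true classes, columns are predicted classes
conf_mat = confusion_matrix(y_true_idx, y_pred_idx, labels=[0, 1, 2])
print(conf_mat)
# [[1 0 0]
#  [0 1 1]
#  [0 0 1]]
```

One caveat: a row of all zeros, like several rows of the y_pred shown in the question, maps to index 0 under argmax and is therefore counted as a class 0 prediction. Checking `y_predict.sum(axis=1) == 0` before converting shows how many such rows there are.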
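I simply subtracted the output y_test matrix from the prediction y_pred matrix, keeping the categorical format. I treat -1 as a false negative and 1 as a false positive.
Next step: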
if output_matrix[i,j] == 1 and predictions_matrix[i,j] == 1:  
    produced_matrix[i,j] = 2 

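This ends with the following notation:

-1: false negative
1: false positive
0: true negative
2: true positive

Finally, with some simple counting you can produce any confusion metric.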
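The subtraction-and-recode steps can be sketched in vectorized form as follows. The names output_matrix, predictions_matrix and produced_matrix follow the snippet above; the data and the `counts` dictionary are assumptions made up for illustration.

```python
import numpy as np

# Made-up one-hot truth (output_matrix) and predictions, 3 samples x 3 classes
output_matrix = np.array([[0, 1, 0],
                          [0, 0, 1],
                          [1, 0, 0]])
predictions_matrix = np.array([[0, 1, 0],
                               [1, 0, 0],
                               [0, 0, 0]])

# Prediction minus truth: -1 = false negative, 1 = false positive, 0 = agreement
produced_matrix = predictions_matrix - output_matrix

# Cells where both matrices are 1 are true positives; recode them as 2
produced_matrix[(output_matrix == 1) & (predictions_matrix == 1)] = 2

# Count each code per class (one column per class)
counts = {
    "TP": (produced_matrix == 2).sum(axis=0),
    "FP": (produced_matrix == 1).sum(axis=0),
    "FN": (produced_matrix == -1).sum(axis=0),
    "TN": (produced_matrix == 0).sum(axis=0),
}

for name, per_class in counts.items():
    print(name, per_class)
# TP [0 1 0]
# FP [1 0 0]
# FN [1 0 1]
# TN [1 2 2]
```

These are per-class binary counts rather than a single class-by-class matrix. Recent scikit-learn versions provide `sklearn.metrics.multilabel_confusion_matrix`, which returns comparable per-class 2x2 tables directly.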
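Why are y_pred and y_test one-hot encoded arrays? What are your original class labels? You should show how you are transforming y.
@VivekKumar I binarized y_train and y_test with y_test = label_binarize(y_test, classes=[0, 1, 2, 3, 4, 5]) for OneVsRestClassifier().
You should pass the original (non-binarized) classes to confusion_matrix. You need to inverse-transform your y_pred to get the original classes back.
@VivekKumar Thank you. I used the non-binarized version and that solved the problem.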
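As a rough sketch of the resolution described in these comments, the classifier from the question can be fitted on the original, non-binarized labels, in which case predict() returns a 1-D array of class labels that confusion_matrix accepts. The iris dataset, the split parameters and random_state are stand-in assumptions, since the asker's data is not shown.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import LabelBinarizer

# Stand-in dataset (assumption): y holds the integer class labels 0, 1, 2
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Same estimator as in the question, fitted on the non-binarized labels
classifier = OneVsRestClassifier(
    GradientBoostingClassifier(n_estimators=70, max_depth=3,
                               learning_rate=.02, random_state=0))
y_pred = classifier.fit(X_train, y_train).predict(X_test)  # 1-D class labels

cnf_matrix = confusion_matrix(y_test, y_pred, labels=classifier.classes_)
print(cnf_matrix)

# If only binarized targets are at hand, a fitted LabelBinarizer can map them back
lb = LabelBinarizer().fit(y_train)
y_test_bin = lb.transform(y_test)                 # one-hot rows
y_test_back = lb.inverse_transform(y_test_bin)    # original labels again
```

OneVsRestClassifier binarizes the labels internally, so binarizing them by hand beforehand is not needed for a plain multiclass problem.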