Python: How do I return the list of false positives from a confusion matrix in scikit-learn?
I am building a binary classifier in scikit-learn to classify text comments. The basic workflow is as follows:
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.metrics import confusion_matrix

# Split the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42)
# Instantiate a model.
nb = MultinomialNB()
# Train the model.
nb.fit(X_train, y_train)
# Make predictions using the trained model.
y_pred_class = nb.predict(X_test)
# View the confusion matrix.
confusion_matrix(y_test, y_pred_class)
# Output of the confusion matrix:
# array([[295,  13],
#        [ 80,  70]])
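According to the confusion matrix, there are 13 false positives and 80 false negatives.
I would like to see the 13 text comments that were classified as false positives.
I followed this to see whether I could get a list of the 13 items classified as false positives.
However, when I run the following: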
X_test[y_test != y_pred_class]
I get the following object:
<458x758 sparse matrix of type '<class 'numpy.float64'>'
with 16890 stored elements in Compressed Sparse Row format>
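For false positives, in addition to y_test != y_pred_class you also need the prediction to be the positive class; the inequality alone also matches the false negatives. Try this: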
import numpy as np
false_positives = np.logical_and(y_test != y_pred_class, y_pred_class == 1)
X_test[false_positives]
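To make the masking logic concrete, here is a minimal sketch on made-up labels. It assumes the positive class is encoded as 1 and the negative class as 0; the arrays y_true and y_pred are hypothetical and not taken from the question.

```python
import numpy as np

# Toy labels, made up for illustration: 1 is the positive class, 0 the negative.
y_true = np.array([0, 0, 1, 1, 0, 1])
y_pred = np.array([0, 1, 1, 0, 1, 1])

# False positive: predicted 1 while the true label is 0.
false_positives = (y_pred == 1) & (y_true == 0)
# False negative: predicted 0 while the true label is 1.
false_negatives = (y_pred == 0) & (y_true == 1)

# Positional row indices of each kind of error.
fp_idx = np.flatnonzero(false_positives)
fn_idx = np.flatnonzero(false_negatives)
print("false positive rows:", fp_idx)
print("false negative rows:", fn_idx)
```

If y_test is a pandas Series rather than a NumPy array, converting it with np.asarray(y_test) first is probably the safer choice, since the Series keeps the shuffled index produced by train_test_split.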
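When I run the code you suggested, I again get an object rather than the list of comments I was expecting.
@grantaguinaldo The code above is just an example of how to find the indices of the false positives. How did you build X from the raw text data? Apply false_positives to that raw data instead.
Did this solve your problem?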
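The suggestion in the comments can be sketched as follows: split the raw text itself, vectorize it afterwards, and apply the boolean mask to the raw test text rather than to the sparse matrix. The question does not show how X was built, so the use of CountVectorizer, the toy texts and labels, and the names text_train and text_test are all assumptions made for illustration.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy comments and labels (1 = positive class, 0 = negative class).
texts = np.array([
    "great product, works well", "terrible, broke in a day",
    "really great value", "awful and terrible support",
    "works great", "broke quickly, awful",
    "great support", "terrible value",
])
labels = np.array([1, 0, 1, 0, 1, 0, 1, 0])

# Split the RAW text so the test comments stay aligned with y_test.
text_train, text_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=42, stratify=labels)

# Vectorize after splitting (assumed step; the original vectorizer is unknown).
vect = CountVectorizer()
X_train = vect.fit_transform(text_train)
X_test = vect.transform(text_test)

nb = MultinomialNB()
nb.fit(X_train, y_train)
y_pred_class = nb.predict(X_test)

# Mask of false positives: predicted 1, actually 0.
false_positives = (y_pred_class == 1) & (y_test == 0)

# Index the raw text, not the sparse matrix, to get readable comments.
fp_comments = text_test[false_positives]
print(list(fp_comments))

# Sanity check: the count should match the top-right cell of the matrix.
cm = confusion_matrix(y_test, y_pred_class, labels=[0, 1])
print(cm[0, 1], len(fp_comments))
```

Indexing X_test with the same mask still returns a sparse matrix of feature counts, which is why the original attempt printed an object summary instead of text.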