Python metrics F1 warning: zero division
I want to compute the F1 score of my model, but I get a warning and an F1 score of 0.0, and I don't know what to do. Here is the source code:
import numpy as np
from sklearn import metrics
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier

# X_train, X_test, y_train, y_test are assumed to be defined earlier
def model_evaluation(classifiers):  # renamed to avoid shadowing the built-in `dict`
    for key, value in classifiers.items():
        classifier = Pipeline([('tfidf', TfidfVectorizer()),
                               ('clf', value)])
        classifier.fit(X_train, y_train)
        predictions = classifier.predict(X_test)
        print("Accuracy Score of", key, ": ", metrics.accuracy_score(y_test, predictions))
        print(metrics.classification_report(y_test, predictions))
        print(metrics.f1_score(y_test, predictions, average="weighted",
                               labels=np.unique(predictions), zero_division=0))
        print("---------------", "\n")

dlist = {"KNeighborsClassifier": KNeighborsClassifier(3),
         "LinearSVC": LinearSVC(),
         "MultinomialNB": MultinomialNB(),
         "RandomForest": RandomForestClassifier(max_depth=5, n_estimators=100)}
model_evaluation(dlist)
Here is the output:
Accuracy Score of KNeighborsClassifier :  0.75
              precision    recall  f1-score   support

not positive       0.71      0.77      0.74        13
    positive       0.79      0.73      0.76        15

    accuracy                           0.75        28
   macro avg       0.75      0.75      0.75        28
weighted avg       0.75      0.75      0.75        28

0.7503192848020434
---------------
Accuracy Score of LinearSVC :  0.8928571428571429
              precision    recall  f1-score   support

not positive       1.00      0.77      0.87        13
    positive       0.83      1.00      0.91        15

    accuracy                           0.89        28
   macro avg       0.92      0.88      0.89        28
weighted avg       0.91      0.89      0.89        28

0.8907396950875212
---------------
Accuracy Score of MultinomialNB :  0.5357142857142857
              precision    recall  f1-score   support

not positive       0.00      0.00      0.00        13
    positive       0.54      1.00      0.70        15

    accuracy                           0.54        28
   macro avg       0.27      0.50      0.35        28
weighted avg       0.29      0.54      0.37        28

0.6976744186046512
---------------
C:\Users\Cey\anaconda3\lib\site-packages\sklearn\metrics\_classification.py:1272: UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples. Use `zero_division` parameter to control this behavior.
_warn_prf(average, modifier, msg_start, len(result))
Accuracy Score of RandomForest :  0.5714285714285714
              precision    recall  f1-score   support

not positive       1.00      0.08      0.14        13
    positive       0.56      1.00      0.71        15

    accuracy                           0.57        28
   macro avg       0.78      0.54      0.43        28
weighted avg       0.76      0.57      0.45        28

0.44897959183673475
---------------
Can someone tell me what to do? I only get this message when using the MultinomialNB() classifier.

Second: when I extend the dictionary with the Gaussian classifier (GaussianNB()), I get the following error message:

TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.

What should I do here?
Can someone tell me what to do? I only get this message when using the MultinomialNB() classifier.

The first warning indicates that, when using MultinomialNB, some label is never predicted, which makes the F-score for that label undefined (ill-defined); with zero_division=0, the missing value is set to 0. That explains the message.
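A minimal sketch (toy labels, not the question's data) shows both the effect of `zero_division` and why restricting `labels` to `np.unique(predictions)` changes the weighted score:

    import numpy as np
    from sklearn.metrics import f1_score

    y_true = np.array([0, 0, 1, 1])
    y_pred = np.array([1, 1, 1, 1])  # class 0 is never predicted

    # With no predicted samples for class 0, its precision/F1 are undefined;
    # zero_division=0 silences the warning and scores that class as 0.0.
    score_all = f1_score(y_true, y_pred, average="weighted", zero_division=0)

    # Restricting to labels that were actually predicted drops class 0
    # entirely, which inflates the weighted average (this mirrors the
    # question's call to f1_score).
    score_pred_only = f1_score(y_true, y_pred, average="weighted",
                               labels=np.unique(y_pred), zero_division=0)

    print(score_all)        # weighted over both classes
    print(score_pred_only)  # weighted over predicted classes only

Here score_all is 1/3 (class 0 contributes an F1 of 0) while score_pred_only is 2/3, which is why the printed f1_score in the question can disagree with the classification report.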
When extending the dictionary with the Gaussian classifier (GaussianNB()), I get the following error message:

TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.
This error is quite explicit: the problem is that TfidfVectorizer returns a sparse matrix, which cannot be used as input to GaussianNB. So, in my view, either avoid GaussianNB, or add an intermediate transformer that turns the sparse array into a dense one; I would not generally recommend densifying the output of tf-idf vectorization, since it can be very large.
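As a sketch of the densifying option (the corpus and labels here are toy placeholders, not the question's data), a FunctionTransformer can be inserted between the vectorizer and GaussianNB:

    import numpy as np
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import FunctionTransformer
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.naive_bayes import GaussianNB

    texts = ["good movie", "great film", "bad movie", "terrible film"]
    labels = [1, 1, 0, 0]

    # TfidfVectorizer emits a scipy sparse matrix; this step converts it
    # to a dense ndarray, which is what GaussianNB requires.
    to_dense = FunctionTransformer(lambda X: X.toarray(), accept_sparse=True)

    clf = Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("to_dense", to_dense),
        ("gnb", GaussianNB()),
    ])
    clf.fit(texts, labels)
    print(clf.predict(["good film"]))

Keep in mind that the dense matrix holds one float per document per vocabulary term, so this only scales to small corpora.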