Scikit-learn: How to compute the F1 micro score using Lasagne


scikit-learn neural-network deep-learning


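I found the code below online and wanted to test it. It works, and the output includes the training loss, test loss, validation score, elapsed time, and so on.

But how can I get the F1 micro score? I also tried importing scikit-learn to compute F1 by adding a cross_val_score call with scoring='f1' to this code: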

import theano.tensor as T
import numpy as np
from nolearn.lasagne import NeuralNet

def multilabel_objective(predictions, targets):
    epsilon = np.float32(1.0e-6)
    one = np.float32(1.0)
    pred = T.clip(predictions, epsilon, one - epsilon)
    return -T.sum(targets * T.log(pred) + (one - targets) * T.log(one - pred), axis=1)

net = NeuralNet(
    # your other parameters here (layers, update, max_epochs...)
    # here are the one you're interested in:
    objective_loss_function=multilabel_objective,
    custom_score=("validation score", lambda x, y: np.mean(np.abs(x - y)))
)

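I got this error:

ValueError: Can't handle mix of multilabel-indicator and continuous-multioutput

How can I implement the F1 micro calculation based on the code above?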

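Assume the true labels on your test set are y_true (shape: (n_samples, n_classes), composed only of 0s and 1s), and your test observations are X_test (shape: (n_samples, n_features)).

You then get the network's predicted values on the test set with y_test = net.predict(X_test).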

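If you are doing multiclass classification:

Since you have set regression to False in your network, y_test should also be composed only of 0s and 1s.

You can compute the micro-averaged F1 score with f1_score from sklearn.metrics, called as f1_score(y_true, y_test, average='micro'), rather than with the cross-validation snippet from the question: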

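# Question's snippet, which raised the ValueError above.
# Note: sklearn.cross_validation was later replaced by sklearn.model_selection.
from sklearn import cross_validation

data = data.astype(np.float32)
classes = classes.astype(np.float32)

net.fit(data, classes)

score = cross_validation.cross_val_score(net, data, classes, scoring='f1', cv=10)

print(score)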

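A small code sample to illustrate this (with dummy data; use your actual y_test and y_true instead):

from sklearn.metrics import f1_score
import numpy as np


y_true = np.array([[0, 0, 1], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1]])

# Micro-averaged F1 over all classes
t = f1_score(y_true, y_pred, average='micro')

If you are doing multilabel classification:

Your output is not a matrix of 0s and 1s but a matrix of probabilities: y_pred[i, j] is the probability that observation i belongs to class j.

You need to define a threshold above which you say that an observation belongs to a given class. You can then assign labels accordingly and proceed exactly as in the previous case.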
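The thresholding step for the multilabel case can be sketched as follows. The arrays are hypothetical dummy data, with y_prob standing in for what net.predict(X_test) would return. The sketch also suggests where the error in the question likely comes from: scoring a 0/1 label matrix against continuous probabilities makes scikit-learn raise a ValueError, and binarizing first avoids it.

```python
import numpy as np
from sklearn.metrics import f1_score

# Hypothetical dummy data: 4 observations, 3 classes.
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
# Stand-in for net.predict(X_test): probabilities, not 0/1 labels.
y_prob = np.array([[0.9, 0.2, 0.7],
                   [0.1, 0.8, 0.3],
                   [0.6, 0.4, 0.1],
                   [0.2, 0.1, 0.95]])

# Scoring probabilities directly mixes a multilabel-indicator target with
# continuous output, which scikit-learn rejects with a ValueError.
try:
    f1_score(y_true, y_prob, average='micro')
    raised = False
except ValueError as err:
    raised = True
    print("ValueError:", err)

# Binarize with a threshold (0.5 is an arbitrary choice), then score.
thresh = 0.5
y_pred_binary = (y_prob > thresh).astype(int)
micro_f1 = f1_score(y_true, y_pred_binary, average='micro')
print(micro_f1)
```

The threshold trades precision against recall, so it is worth trying a few values on a validation set rather than fixing it in advance.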

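Is your output continuous (probabilities between 0 and 1) or binary (0 or 1, with at least one 1 per row)?

Please avoid including screenshots of code, as they are not searchable.

@M.Massias The result is binary (1 or 0), with at least one 1 per row.

Then have you tried the first part of my answer?

@M.Massias, I asked a related question before and you helped me with it too, please take a look. I don't know how to set the predicted and true values.

Yes, I tested the multilabel classification part and it worked. How can I implement cross-validation on top of this? I used this sample code: pred = net.predict(data[1:50]), thresh = 0.5, y_test_binary = np.where(pred > thresh, 1, 0), t = f1_score(labels[1:50], y_test_binary, average='micro')

t is the score you want.

As I understand it, I am not doing cross-validation when computing t, right? Or can f1_score do cross-validation by itself?

Then your question is not well posed, since it does not mention cross-validation apart from once in the code. You can compute the F1 score several times, each time with a different training and test set, and then average those scores.

I see, I will try to figure it out. Thank you very much.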
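The advice in the last comments (score several train/test splits, then average) could look roughly like the sketch below. It is only an illustration under stated assumptions: the data is synthetic, and a scikit-learn OneVsRestClassifier with LogisticRegression stands in for the nolearn network, which is not reproduced here. With the real network you would fit a fresh net on each fold and threshold net.predict(...) in the same way.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import KFold
from sklearn.multiclass import OneVsRestClassifier

# Synthetic multilabel data (assumption: replace with your own data and labels).
X, Y = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=4, random_state=0
)

thresh = 0.5  # arbitrary threshold, as in the comments above
scores = []
kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in kfold.split(X):
    # Stand-in model; a fresh instance is trained on each fold.
    model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
    model.fit(X[train_idx], Y[train_idx])

    # Per-class probabilities on the held-out fold, then binarize.
    prob = model.predict_proba(X[test_idx])
    y_bin = np.where(prob > thresh, 1, 0)

    scores.append(f1_score(Y[test_idx], y_bin, average='micro'))

# Cross-validated estimate: the mean of the per-fold micro F1 scores.
mean_f1 = float(np.mean(scores))
print(scores, mean_f1)
```

f1_score itself never splits data; it only compares two label arrays, so the loop over folds has to be written explicitly (or delegated to a scorer that binarizes predictions first).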


# Multilabel case: y_test holds the probabilities returned by net.predict(X_test)
thresh = 0.8  # choose your own value
# creates an array with 1 where y_test > thresh, 0 elsewhere
y_test_binary = np.where(y_test > thresh, 1, 0)

f1_score(y_true, y_test_binary, average='micro')
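For reference, average='micro' pools true positives, false positives and false negatives over all classes before computing F1, i.e. F1 = 2·TP / (2·TP + FP + FN). A minimal check of that formula against f1_score, reusing the dummy arrays from the multiclass sample (the variable names tp, fp and fn are introduced here for illustration):

```python
import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([[0, 0, 1], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1], [0, 0, 1]])

# Count outcomes over every (observation, class) cell at once.
tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # predicted 1, truly 1
fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # predicted 1, truly 0
fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # predicted 0, truly 1

manual_micro_f1 = 2 * tp / (2 * tp + fp + fn)
sklearn_micro_f1 = f1_score(y_true, y_pred, average='micro')
print(tp, fp, fn, manual_micro_f1, sklearn_micro_f1)
```

Here TP = 3, FP = 2 and FN = 2, so both values come out to 0.6.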