TensorFlow: custom Keras binary cross-entropy loss function gives incorrect results


tensorflow keras


Does anyone have a convincing solution for getting a custom binary cross-entropy loss to work?

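I have tried every approach I could think of (even making the batch size equal to the full training set size, to rule out any dependence on the global mean during batching). Still, I see a significant difference between my binary cross-entropy implementation and the Keras one (selected with loss='binary_crossentropy').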

My custom binary cross-entropy code is as follows:

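from tensorflow.keras import backend as K
_EPSILON = K.epsilon()  # assumed; the original post does not show how _EPSILON is defined

# Attempt 1: explicit formula, averaged over the whole batch (returns a scalar).
def _loss_tensor(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    out = y_true * K.log(y_pred) + (1.0 - y_true) * K.log(1.0 - y_pred)
    return -K.mean(out)

# Attempt 2: element-wise loss, no mean. Note that "+ -(1.0 - y_true)" flips the
# sign of the second term, so this is not the standard cross-entropy.
def _loss_tensor2(y_true, y_pred):
    y_pred = K.clip(y_pred, _EPSILON, 1.0 - _EPSILON)
    out = -(y_true * K.log(y_pred) + -(1.0 - y_true) * K.log(1.0 - y_pred))
    return out

# Attempt 3: the backend helper. It reuses the name of attempt 2 as in the
# original post, so this definition replaces the previous one.
def _loss_tensor2(y_true, y_pred):
    loss1 = K.binary_crossentropy(y_true, y_pred)
    return loss1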

None of these approaches works. It still does not work even if I apply K.mean() before returning the result from the custom loss function.

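I cannot understand what is special about using loss='binary_crossentropy'. When I use my custom loss function, training goes badly, whereas the built-in loss works as expected.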

I need a custom loss function so that I can adjust the loss according to the error and penalize one type of misclassification more heavily.
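One common way to penalize one kind of error more than the other is to weight the two terms of the cross-entropy separately. The following is only a minimal sketch of that idea, not the method used in this thread: the factory name make_weighted_bce, the parameters fn_weight and fp_weight, and the constant EPS are all hypothetical names introduced for illustration, and plain TensorFlow ops are used instead of the Keras backend helpers.

```python
import tensorflow as tf

EPS = 1e-7  # assumed clipping constant, same order as the default Keras epsilon


def make_weighted_bce(fn_weight=1.0, fp_weight=1.0):
    """Build a Keras-compatible loss that weights the two BCE terms separately.

    fn_weight scales the term that is active when y_true == 1 (missed positives).
    fp_weight scales the term that is active when y_true == 0 (false alarms).
    Both parameter names are hypothetical, chosen for this sketch.
    """
    def loss(y_true, y_pred):
        y_pred = tf.convert_to_tensor(y_pred)
        y_true = tf.cast(y_true, y_pred.dtype)
        # Keep probabilities away from 0 and 1 so both logs stay finite.
        y_pred = tf.clip_by_value(y_pred, EPS, 1.0 - EPS)
        pos_term = fn_weight * y_true * tf.math.log(y_pred)
        neg_term = fp_weight * (1.0 - y_true) * tf.math.log(1.0 - y_pred)
        # One loss value per sample, averaged over the output units.
        return -tf.reduce_mean(pos_term + neg_term, axis=-1)

    return loss
```

With both weights left at 1.0 this reduces to the ordinary binary cross-entropy. Assuming an already built model, it could be passed as model.compile(optimizer="adam", loss=make_weighted_bce(fn_weight=3.0)) to make missed positives three times as costly.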

I found a way to meet this requirement and am posting it here:

from tensorflow.keras import backend as K

def custom_binary_loss(y_true, y_pred):
    # Reference:
    # https://github.com/tensorflow/tensorflow/blob/v2.3.1/tensorflow/python/keras/backend.py#L4826
    # Keep probabilities away from 0 and 1 so the logs stay finite.
    y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())

    term_0 = (1 - y_true) * K.log(1 - y_pred + K.epsilon())  # Vanishes when the target is 1
    term_1 = y_true * K.log(y_pred + K.epsilon())  # Vanishes when the target is 0

    # Average over axis 1 (the output units) to get one loss value per sample.
    return -K.mean(term_0 + term_1, axis=1)

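However, it is still unclear why the built-in function performs so differently from the explicit-formula approach. I suspect it comes mainly from how the upper and lower bounds of the y_pred probability values are handled.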
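To make the point about the bounds concrete, the sketch below evaluates the explicit formula with and without clipping and sets it next to the built-in tf.keras.losses.binary_crossentropy. The helper bce_explicit, the constant EPS, and the sample values are assumptions made up for this illustration. It only shows what happens when a prediction saturates at exactly 0 or 1, and does not claim to explain every difference seen during training.

```python
import tensorflow as tf

EPS = 1e-7  # assumed clipping constant, same order as the default Keras epsilon


def bce_explicit(y_true, y_pred, clip=True):
    """Explicit binary cross-entropy; clip=False drops the safeguard on y_pred."""
    y_pred = tf.convert_to_tensor(y_pred, dtype=tf.float32)
    y_true = tf.cast(y_true, y_pred.dtype)
    if clip:
        y_pred = tf.clip_by_value(y_pred, EPS, 1.0 - EPS)
    per_element = (y_true * tf.math.log(y_pred)
                   + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
    return -tf.reduce_mean(per_element, axis=-1)


# Three samples with one output unit each; the last prediction is
# saturated at 0.0 while its target is 1.
y_true = tf.constant([[1.0], [0.0], [1.0]])
y_pred = tf.constant([[0.9], [0.2], [0.0]])

builtin = tf.keras.losses.binary_crossentropy(y_true, y_pred)
clipped = bce_explicit(y_true, y_pred, clip=True)
unclipped = bce_explicit(y_true, y_pred, clip=False)

print("built-in :", builtin.numpy())
print("clipped  :", clipped.numpy())
print("unclipped:", unclipped.numpy())  # the saturated sample evaluates to inf
```

For the two unsaturated samples all three versions agree to within rounding. For the saturated one, the unclipped formula takes log(0) and returns an infinite loss, while the clipped version and the built-in stay finite. The exact finite value may vary slightly between TensorFlow versions, depending on how epsilon is applied.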
