TensorFlow for binary classification: a single sigmoid output vs. two linear outputs (with sparse softmax cross-entropy loss)

machine-learning tensorflow


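I am experimenting with a binary classifier implementation in TensorFlow. If I have two plain outputs (i.e. no activation) in the final layer and use tf.losses.sparse_softmax_cross_entropy, my network trains as expected. However, if I change the output layer to produce a single output with a tf.sigmoid activation and use tf.losses.log_loss as the loss function, my network does not train (i.e. loss and accuracy do not improve).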

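Here is what my output layer and loss function look like in the first (i.e. working) case:

out = tf.layers.dense(prev, 2)
loss = tf.losses.sparse_softmax_cross_entropy(labels=y, logits=out)

In the second case, I have the following:

out = tf.layers.dense(prev, 1, activation=tf.sigmoid)
loss = tf.losses.log_loss(labels=y, predictions=out)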

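The tensor y is a vector of binary label values; it is not one-hot encoded. In the first case the network learns as expected, but in the second case it does not. Apart from these two lines, everything else is kept the same.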

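I do not understand why the second setup does not work. Interestingly, if I express the same network in Keras and use the second setup, it works. Am I using the wrong TensorFlow functions to express my intent in the second case? I want to produce a single sigmoid output and use a binary cross-entropy loss to train a simple binary classifier.

I am using Python 3.6 and TensorFlow 1.4.

There is a small, runnable Python script that demonstrates the problem. Note that you need to download the StatOil/C-CORE dataset from Kaggle to run the script as is.

Thanks.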

Using a sigmoid activation on two outputs does not give a probability distribution:

import tensorflow as tf
import tensorflow.contrib.eager as tfe
tfe.enable_eager_execution()

start = tf.constant([[4., 5.]])
out_dense = tf.layers.dense(start, units=2)
print("Logits (un-transformed)", out_dense)
out_sigmoid = tf.layers.dense(start, units=2, activation=tf.sigmoid)
print("Elementwise sigmoid", out_sigmoid)
out_softmax = tf.nn.softmax(tf.layers.dense(start, units=2))
print("Softmax (probability distribution)", out_softmax)

Prints:

Logits (un-transformed) tf.Tensor([[-3.64021587  6.90115976]], shape=(1, 2), dtype=float32)
Elementwise sigmoid tf.Tensor([[ 0.94315267  0.99705648]], shape=(1, 2), dtype=float32)
Softmax (probability distribution) tf.Tensor([[ 0.05623185  0.9437682 ]], shape=(1, 2), dtype=float32)

Instead of tf.nn.softmax, you could also use tf.sigmoid on a single logit and then set the other output to one minus that value.
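The relationship between a single sigmoid output and a two-way softmax can be checked numerically. The following is a minimal sketch in plain Python rather than TensorFlow; the helper functions and the logit value are illustrative assumptions, not taken from the thread. It shows that the pair [1 - p, p], with p the sigmoid of a single logit z, matches a softmax over the two logits [0, z].

```python
import math


def sigmoid(z):
    """Logistic function for a single logit."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    """Softmax over a list of logits, shifted by the max for stability."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


z = 1.7  # hypothetical single logit, chosen only for illustration
p = sigmoid(z)

# Two-class distribution built from the single sigmoid output.
two_class = [1.0 - p, p]

# The same distribution from a softmax whose first logit is fixed at 0.
reference = softmax([0.0, z])

print("from sigmoid:", two_class)
print("from softmax:", reference)
```

Both lists sum to one and agree up to floating-point error, which is why a single sigmoid output is enough to describe a two-class distribution.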

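I am not using tf.sigmoid on two outputs. I am using it on a single output, and using tf.losses.log_loss to compute the loss.

Ha, right. So tf.losses.log_loss expects a one-hot encoded vector, but that is the same as a sparse label when the distribution is over a scalar. But what is the problem? tf.losses.log_loss(labels=[[label]], predictions=[[prediction_scalar]]) is equivalent to sparse_softmax_cross_entropy(labels=[[label]], logits=[[1. - prediction_scalar, prediction_scalar]]) for labels in [0, 1].

I ran into the same problem. Sigmoid and log_loss work in Keras, but do not learn in TensorFlow. However, sparse softmax cross-entropy seems to work. Did you figure out what is going on?

Unfortunately not. I am still interested in understanding why we are seeing this problem.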
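The equivalence raised in the comments can be checked by hand. The sketch below is plain Python with hypothetical helper functions; it is not the TensorFlow implementation and it ignores any epsilon smoothing a library may apply. Under those assumptions, the binary log loss on a probability p matches a sparse softmax cross-entropy when the two logits are the log-probabilities [log(1 - p), log(p)]. Passing the raw probabilities [1 - p, p] as logits gives a different value, because softmax is applied to them again.

```python
import math


def log_loss(label, p):
    """Binary cross-entropy for one example; label is 0 or 1, p is P(label = 1)."""
    return -label * math.log(p) - (1 - label) * math.log(1.0 - p)


def sparse_softmax_cross_entropy(label, logits):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


p = 0.8  # hypothetical sigmoid output, chosen only for illustration

results = []
for label in (0, 1):
    binary = log_loss(label, p)
    # Logits given as log-probabilities: softmax recovers [1 - p, p] exactly.
    from_log_probs = sparse_softmax_cross_entropy(
        label, [math.log(1.0 - p), math.log(p)])
    # Raw probabilities used as logits: softmax changes the distribution.
    from_raw_probs = sparse_softmax_cross_entropy(label, [1.0 - p, p])
    results.append((label, binary, from_log_probs, from_raw_probs))
    print(label, binary, from_log_probs, from_raw_probs)
```

This only compares loss values for one example. It does not explain why the single-output setup fails to train in the original script.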