TensorFlow Sigmoid Cross Entropy with Logits for 1D data


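Context

Suppose we have some 1D data (e.g. a time series) in which every sequence has the same fixed length l, and we want to perform semantic segmentation with n classes. The output for a single example then has shape [n, l] (i.e. the data_format is not "channels_last"), and the batched output has shape [b, n, l], where b is the number of examples in the batch.

The classes are independent, so my understanding is that sigmoid cross-entropy, rather than softmax cross-entropy, is the appropriate loss here.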
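
To make the distinction between the two losses concrete, here is a minimal NumPy sketch of the activations behind them. The sizes b, n, l and the random logits are made up for illustration, and the helper functions are written by hand rather than taken from TensorFlow. It only shows that softmax couples the classes at each position, while sigmoid treats every entry independently.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis):
    """Softmax along `axis`, shifted by the max for numerical stability."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=axis, keepdims=True)

# Hypothetical sizes, chosen only for illustration.
b, n, l = 2, 3, 4                      # batch size, classes, sequence length
rng = np.random.default_rng(0)
logits = rng.normal(size=(b, n, l))    # 'channels_first' layout: [b, n, l]

p_sigmoid = sigmoid(logits)            # every entry is its own probability
p_softmax = softmax(logits, axis=1)    # probabilities compete across the class axis

# Softmax forces the n class probabilities at each position to sum to 1.
softmax_sums = p_softmax.sum(axis=1)   # shape [b, l]
# Sigmoid imposes no constraint across classes, so a position can score
# high (or low) for several classes at once.
sigmoid_sums = p_sigmoid.sum(axis=1)   # shape [b, l]
```

If a position may legitimately carry several labels at once, as in the overlapping label rows of this question, the sigmoid form is the one that can represent it.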

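Question

I have a few small, related questions about the expected input format and use of tf.nn.sigmoid_cross_entropy_with_logits:

1. Since the network outputs a tensor with the same shape as the batched labels, should I train the network under the assumption that it outputs logits, or take the Keras approach (see Keras's binary_crossentropy) and assume it outputs probabilities?
2. Given the 1D segmentation problem, should I call tf.nn.sigmoid_cross_entropy_with_logits on
   - data_format='channels_first' (as shown above), or
   - data_format='channels_last' (example.T)
   if I want the labels to be assigned individually per channel?
3. Should the loss operation passed to the optimizer be
   - tf.nn.sigmoid_cross_entropy_with_logits(labels, logits),
   - tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels, logits)), or
   - tf.losses.sigmoid_cross_entropy?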


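Code

This code highlights my confusion and seems to demonstrate that the data_format does in fact matter, but the documentation does not explicitly state which format is expected. The listings at the end of this page cover, in order: dummy data; losses computed with tf.nn, tf.reduce_mean(tf.nn) and tf.losses; and equivalence tests for data_format and for the loss functions.

Answer
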
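Both tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(...)) and tf.losses.sigmoid_cross_entropy(...) (with default arguments) compute the same thing. The problem is in your tests, where you use == to compare floating-point numbers. Instead, use np.isclose to check whether two floating-point values are equal within a tolerance: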
# loss _should_(?) be the same for 'channels_first' and 'channels_last' data_format
# test example_1
e1 = np.isclose(l1, t_l1.T).all()
# test example 2
e2 = np.isclose(l2, t_l2.T).all()

# loss calculated for each example and then batched together should be the same 
# as the loss calculated on the batched examples
ea = np.isclose(np.array([l1, l2]), bl).all()
t_ea = np.isclose(np.array([t_l1, t_l2]), t_bl).all()

# loss calculated on the batched examples for 'channels_first' should be the same
# as loss calculated on the batched examples for 'channels_last'
eb = np.isclose(bl, np.transpose(t_bl, (0, 2, 1))).all()


e1, e2, ea, t_ea, eb
# (True, True, True, True, True)

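The same applies to the loss-function comparison (the l_e1 ... l_t_eb checks): with np.isclose in place of ==, all six tests return True, as shown in the last listing on this page.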
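
As a small illustration of why == is fragile here, the sketch below uses a textbook example rather than TensorFlow output. It assumes that the two code paths differ only by rounding, for example through a different order of accumulation; that cause is an assumption and has not been checked against TensorFlow's internals.

```python
import numpy as np

# Floating-point addition is not associative, so two mathematically
# identical computations can differ slightly in their last digits.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)

exact_match = (left == right)                  # exact comparison fails
close_match = bool(np.isclose(left, right))    # equal within tolerance

# The same check, element-wise on arrays, in the style of the tests above.
a = np.array([left, 1.0, 2.0])
b = np.array([right, 1.0, 2.0])
all_exact = bool((a == b).all())
all_close = bool(np.isclose(a, b).all())
```

np.isclose uses a relative tolerance of 1e-05 and an absolute tolerance of 1e-08 by default; both can be tightened through its rtol and atol arguments if a stricter check is wanted.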
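Comments:

I think this answer might help you.

@today I read that answer before, but I am still not quite clear, because the dimension along which the classes are independent is not explicitly demonstrated, and my results in Colab differ from what that answer implies.

That makes sense... So if each class is supposed to be independent, why doesn't the data format matter? Or is every item within every class independent?

@SumNeuron The sigmoid and the cross-entropy loss are computed for each element separately (this follows from your assumption: each element may belong to more than one class, so the classes are independent). Therefore the data_format does not matter here.

Just to clarify: should there be an activation function before the sigmoid cross-entropy loss? Or should the graph have two output nodes, one that computes the loss with sigmoid cross-entropy and another that returns the sigmoid of the output layer?

@SumNeuron Both tf.nn.sigmoid_cross_entropy_with_logits and tf.losses.sigmoid_cross_entropy apply the sigmoid first (which is why they expect logits as input) and then compute the cross-entropy loss. So you should not apply the sigmoid separately.

OK, now I am a bit confused (sorry about that). I was reading Maxim's answer, which says the network's output is assumed to be "logits" (whereas Keras assumes probabilities by default). If that is the case, I pass the output layer to the sigmoid cross-entropy to compute the loss, but can I add another output node to the graph that returns the probabilities?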
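
The two points made in the comments (the loss is computed per element, and the sigmoid is applied inside the loss function) can be sketched in plain NumPy. This is a hand-written re-implementation of the stable formula given in the TensorFlow documentation, max(x, 0) - x * z + log(1 + exp(-|x|)), and not TensorFlow's own code; the function names, sizes and random data are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_xent_with_logits(labels, logits):
    """Element-wise sigmoid cross-entropy in the numerically stable form
    max(x, 0) - x * z + log(1 + exp(-|x|)), with x = logits, z = labels."""
    x, z = logits, labels
    return np.maximum(x, 0) - x * z + np.log1p(np.exp(-np.abs(x)))

rng = np.random.default_rng(0)
b, c, p = 2, 5, 10                                    # batch, classes, positions
logits = rng.normal(size=(b, c, p))                   # 'channels_first': [b, c, p]
labels = (rng.random((b, c, p)) > 0.8).astype(float)

loss_cf = sigmoid_xent_with_logits(labels, logits)    # shape [b, c, p]

# 'channels_last' is a transpose of the last two axes: [b, p, c]
loss_cl = sigmoid_xent_with_logits(labels.transpose(0, 2, 1),
                                   logits.transpose(0, 2, 1))

# Nothing is reduced across any axis, so the layout only permutes the result.
same_per_element = bool(np.allclose(loss_cf, loss_cl.transpose(0, 2, 1)))
same_mean = bool(np.isclose(loss_cf.mean(), loss_cl.mean()))

# The sigmoid is already inside the loss. Probabilities for inference can be
# computed separately from the same logits, and the naive cross-entropy on
# those probabilities agrees with the stable form.
probs = sigmoid(logits)
naive = -(labels * np.log(probs) + (1 - labels) * np.log(1 - probs))
matches_naive = bool(np.allclose(loss_cf, naive))
```

Under these assumptions, one reasonable arrangement is to feed the raw logits to the loss and to apply a separate sigmoid to the same logits only when probabilities are needed for prediction.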
# Label layout for one example: one row per class, one column per position
          # [ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11]    index
labeled = [
            [ 0,  1,  1,  0,  0,  0,  0,  0,  0,  0,  0,  0], # class 1
            [ 0,  0,  0,  0,  1,  1,  1,  1,  0,  0,  0,  0], # class 2
            [ 0,  0,  0,  0,  0,  0,  0,  1,  1,  1,  0,  0], # class 3
           #[                     ...                      ],
            [ 1,  1,  1,  0,  0,  0,  0,  0,  1,  1,  1,  1], # class n
 ]

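# Dummy data
import random

import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.losses)

c = 5  # number of channels (label classes)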
p = 10 # number of positions ('pixels')


# data_format = 'channels_first', shape = [classes, pixels]
# 'logits' for 2 examples
pred_1 = np.array([[random.random() for v in range(p)]for n in range(c)]).astype(float)
pred_2 = np.array([[random.random() for v in range(p)]for n in range(c)]).astype(float)

# 'ground truth' for the above 2 examples
targ_1 = np.array([[0 if random.random() < 0.8 else 1 for v in range(p)]for n in range(c)]).astype(float)
targ_2 = np.array([[0 if random.random() < 0.8 else 1 for v in range(p)]for n in range(c)]).astype(float)

# batched form of the above examples
preds = np.array([pred_1, pred_2])
targs = np.array([targ_1, targ_2])


# data_format = 'channels_last', shape = [pixels, classes]
t_pred_1 = pred_1.T
t_pred_2 = pred_2.T
t_targ_1 = targ_1.T
t_targ_2 = targ_2.T

t_preds = np.array([t_pred_1, t_pred_2])
t_targs = np.array([t_targ_1, t_targ_2])

# Losses with tf.nn.sigmoid_cross_entropy_with_logits (element-wise, unreduced)
# calculate individual losses for 'channels_first'
loss_1 = tf.nn.sigmoid_cross_entropy_with_logits(labels=targ_1, logits=pred_1)
loss_2 = tf.nn.sigmoid_cross_entropy_with_logits(labels=targ_2, logits=pred_2)
# calculate batch loss for 'channels_first'
b_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=targs, logits=preds)

# calculate individual losses for 'channels_last'
t_loss_1 = tf.nn.sigmoid_cross_entropy_with_logits(labels=t_targ_1, logits=t_pred_1)
t_loss_2 = tf.nn.sigmoid_cross_entropy_with_logits(labels=t_targ_2, logits=t_pred_2)
# calculate batch loss for 'channels_last'
t_b_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=t_targs, logits=t_preds)
# get actual tensors
with tf.Session() as sess:
  # loss for 'channels_first'
  l1   = sess.run(loss_1)
  l2   = sess.run(loss_2)
  # batch loss for 'channels_first'
  bl   = sess.run(b_loss)

  # loss for 'channels_last'
  t_l1 = sess.run(t_loss_1)
  t_l2 = sess.run(t_loss_2)

  # batch loss for 'channels_last'
  t_bl = sess.run(t_b_loss)

# Losses with tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(...))
# calculate individual losses for 'channels_first'
rm_loss_1 = tf.reduce_mean(loss_1)
rm_loss_2 = tf.reduce_mean(loss_2)
# calculate batch loss for 'channels_first'
rm_b_loss = tf.reduce_mean(b_loss)

# calculate individual losses for 'channels_last'
rm_t_loss_1 = tf.reduce_mean(t_loss_1)
rm_t_loss_2 = tf.reduce_mean(t_loss_2)
# calculate batch loss for 'channels_last'
rm_t_b_loss = tf.reduce_mean(t_b_loss)
# get actual tensors
with tf.Session() as sess:
  # loss for 'channels_first'
  rm_l1   = sess.run(rm_loss_1)
  rm_l2   = sess.run(rm_loss_2)
  # batch loss for 'channels_first'
  rm_bl   = sess.run(rm_b_loss)

  # loss for 'channels_last'
  rm_t_l1 = sess.run(rm_t_loss_1)
  rm_t_l2 = sess.run(rm_t_loss_2)

  # batch loss for 'channels_last'
  rm_t_bl = sess.run(rm_t_b_loss)

# Losses with tf.losses.sigmoid_cross_entropy
# calculate individual losses for 'channels_first'
tf_loss_1 = tf.losses.sigmoid_cross_entropy(multi_class_labels=targ_1, logits=pred_1)
tf_loss_2 = tf.losses.sigmoid_cross_entropy(multi_class_labels=targ_2, logits=pred_2)
# calculate batch loss for 'channels_first'
tf_b_loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=targs, logits=preds)

# calculate individual losses for 'channels_last'
tf_t_loss_1 = tf.losses.sigmoid_cross_entropy(multi_class_labels=t_targ_1, logits=t_pred_1)
tf_t_loss_2 = tf.losses.sigmoid_cross_entropy(multi_class_labels=t_targ_2, logits=t_pred_2)
# calculate batch loss for 'channels_last'
tf_t_b_loss = tf.losses.sigmoid_cross_entropy(multi_class_labels=t_targs, logits=t_preds)
# get actual tensors
with tf.Session() as sess:
  # loss for 'channels_first'
  tf_l1   = sess.run(tf_loss_1)
  tf_l2   = sess.run(tf_loss_2)
  # batch loss for 'channels_first'
  tf_bl   = sess.run(tf_b_loss)

  # loss for 'channels_last'
  tf_t_l1 = sess.run(tf_t_loss_1)
  tf_t_l2 = sess.run(tf_t_loss_2)

  # batch loss for 'channels_last'
  tf_t_bl = sess.run(tf_t_b_loss)

# Test equivalency: data_format (original tests from the question, using ==)
# loss _should_(?) be the same for 'channels_first' and 'channels_last' data_format
# test example_1
e1 = (l1 == t_l1.T).all()
# test example 2
e2 = (l2 == t_l2.T).all()

# loss calculated for each example and then batched together should be the same 
# as the loss calculated on the batched examples
ea = (np.array([l1, l2]) == bl).all()
t_ea = (np.array([t_l1, t_l2]) == t_bl).all()

# loss calculated on the batched examples for 'channels_first' should be the same
# as loss calculated on the batched examples for 'channels_last'
eb = (bl == np.transpose(t_bl, (0, 2, 1))).all()


e1, e2, ea, t_ea, eb
# (True, False, False, False, True) <- changes every time, so True is happenstance

# Test equivalency: loss functions (original tests from the question, using ==)
l_e1 = tf_l1 == rm_l1
l_e2 = tf_l2 == rm_l2
l_eb = tf_bl == rm_bl

l_t_e1 = tf_t_l1 == rm_t_l1
l_t_e2 = tf_t_l2 == rm_t_l2
l_t_eb = tf_t_bl == rm_t_bl

l_e1, l_e2, l_eb, l_t_e1, l_t_e2, l_t_eb
# (False, False, False, False, False, False)

# Loss-function equivalency from the answer, using np.isclose instead of ==
l_e1 = np.isclose(tf_l1, rm_l1)
l_e2 = np.isclose(tf_l2, rm_l2)
l_eb = np.isclose(tf_bl, rm_bl)

l_t_e1 = np.isclose(tf_t_l1, rm_t_l1)
l_t_e2 = np.isclose(tf_t_l2, rm_t_l2)
l_t_eb = np.isclose(tf_t_bl, rm_t_bl)

l_e1, l_e2, l_eb, l_t_e1, l_t_e2, l_t_eb
# (True, True, True, True, True, True)