One-hot encoding in a Python loss function
I am trying to one-hot encode the predictions inside my loss function:
from tensorflow.keras import backend as K

def loss(y_true, y_pred, smooth=1e-7):
    # one_hot, softargmax, and n_classes are defined elsewhere in my code
    y_true = K.flatten(y_true)
    y_true = one_hot(y_true, n_classes)
    y_pred = softargmax(y_pred)
    y_pred = K.flatten(y_pred)
    y_pred = one_hot(y_pred, n_classes)
    # soft Dice coefficient: 2 * |intersection| / (|y_true| + |y_pred|)
    intersect = K.sum(y_true * y_pred, axis=-1)
    denom = K.sum(y_true + y_pred, axis=-1)
    return K.mean((2. * intersect / (denom + smooth)))
But casting y_pred to int32 so that I can use the built-in K.one_hot raises

ValueError: No gradients provided for any variable:
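To see why (a minimal sketch of my own, not code from the original post): casting to an integer dtype is non-differentiable, so every gradient comes back as None, and the optimizer then raises the ValueError above.

import tensorflow as tf

x = tf.Variable([0.2, 1.7, 2.1])
with tf.GradientTape() as tape:
    # tf.one_hot needs integer indices, so the float predictions must be cast
    y = tf.one_hot(tf.cast(x, tf.int32), depth=3)
    loss = tf.reduce_sum(y)
print(tape.gradient(loss, x))  # None: the int32 cast severs the gradient path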
So I wrote my own one_hot encoding method that avoids converting y_pred to int32:
def one_hot(xs, n_classes):
    # row i of the identity matrix is the one-hot vector for class i
    table = tf.eye(n_classes, dtype=tf.dtypes.float32)
    return tf.map_fn(lambda x: table[tf.raw_ops.Cast(x=x, DstT=tf.int32)], xs)

one_hot(tf.constant([0.0, 1.0, 2.0]), 3)
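Run eagerly, the example call simply picks rows of the identity matrix (output shown as a comment for reference):

print(one_hot(tf.constant([0.0, 1.0, 2.0]), 3).numpy())
# [[1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]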
My problem is the following: using tf.gather / tf.gather_nd instead leads to the same gradient error. The only function I could find that does not trigger the gradient error is tf.map_fn, which is very slow, and switching to tf.vectorized_map brings the gradient error back. Is there another way to implement a one-hot encoding that keeps gradients?

You can create a numerically stable one_hot by rescaling so the maximum logit maps to 1.0 and masking out everything else:
import tensorflow as tf

def stable_one_hot(vec):
    """
    Args:
        vec: tf.Tensor, a batch of logits to be encoded
    Returns:
        tf.Tensor, a batch of numerically stable one-hot encoded logits
    """
    # exp(vec - max) maps each row's maximum to exactly 1.0
    m = tf.math.reduce_max(vec, axis=1, keepdims=True)
    e = tf.math.exp(vec - m)
    # mask every entry that is not the row maximum ...
    mask = tf.cast(tf.math.not_equal(e, 1.0), tf.float32)
    # ... and push it towards -inf before the softmax
    vec -= 1e9 * mask
    return tf.nn.softmax(vec, axis=1)
# dummy data w/ batch of size 32
X = tf.random.normal([32, 100])
# dummy labels w/ 10 possibilities
y = tf.random.uniform(shape=[32], minval=0, maxval=10, dtype=tf.int32)
# one-hot them
y_true = tf.one_hot(y, 10)
# simple network
nn = tf.keras.layers.Dense(10)
# forward pass
with tf.GradientTape() as tape:
    y_pred = nn(X)
    y_pred = stable_one_hot(y_pred)
    intersect = tf.math.reduce_sum(y_true * y_pred, -1)
    denom = tf.math.reduce_sum(y_true + y_pred, -1)
    loss = 2.0 * intersect / (denom + 1e-7)
    loss = tf.math.reduce_mean(loss)

grads = tape.gradient(loss, nn.trainable_variables)
assert grads != [None, None]
print(f"loss: {loss.numpy():.4f}")
# loss: 0.1250
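As a quick sanity check (the example values here are my own), stable_one_hot returns an effectively one-hot row for each batch entry:

logits = tf.constant([[0.5, 2.0, -1.0],
                      [3.0, 0.1, 0.2]])
print(stable_one_hot(logits).numpy())
# [[0. 1. 0.]
#  [1. 0. 0.]]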
Why not just use tf.one_hot directly?

@gobrewers14 As I wrote in the question, to make one_hot work, y_pred has to be cast to int32, and the cast operation has no gradient, so that does not work.