Python 3.x: How to fix "No gradients provided for any variable" error when using ctc_loss in TensorFlow

Tags: python-3.x, tensorflow, keras, deep-learning, tensorflow2.0

I am trying to implement Baidu's Deep Speech 2 model in TensorFlow 2.0.0alpha0. I am having trouble getting TensorFlow's ctc_loss to optimize when computing gradients with a tf.GradientTape() object.

I am currently passing my model a tensor of shape (batch_size, max_step, feats) and then passing the computed logits to the loss function. I have also tried passing a sparse tensor, but that does not work either.
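For reference, tf.nn.ctc_loss in TF 2.x takes time-major logits by default, shaped (max_time, batch_size, vocab_size), along with dense integer labels and per-example length vectors. The sketch below only illustrates those shapes; the batch size, time steps, feature count and vocabulary size are arbitrary values chosen for the example, not values from the actual model.

import tensorflow as tf

# Arbitrary sizes, purely to illustrate the shapes tf.nn.ctc_loss expects.
batch_size, max_time, num_feats, vocab_size = 4, 50, 161, 29
max_label_len = 10

# Model input: (batch_size, max_step, feats)
specs = tf.random.normal((batch_size, max_time, num_feats))

# Logits as passed to the loss: time-major (max_time, batch_size, vocab_size)
logits = tf.random.normal((max_time, batch_size, vocab_size))

# Dense integer labels; class 0 is reserved for the CTC blank in this sketch,
# so label values start at 1.
labels = tf.random.uniform((batch_size, max_label_len), minval=1,
                           maxval=vocab_size, dtype=tf.int32)

loss = tf.nn.ctc_loss(labels=labels, logits=logits,
                      label_length=tf.fill([batch_size], max_label_len),
                      logit_length=tf.fill([batch_size], max_time),
                      logits_time_major=True, blank_index=0)
print(loss.shape)  # (batch_size,)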

Below is the code that creates my model:

import tensorflow as tf


class DeepSpeech2(tf.keras.Model):

    def __init__(self, vocab_size, conv_filters=[11], conv_kernel_sizes=[1280], conv_strides=[2], 
                 recur_sizes=[100], rnn_type='gru', bidirect_rnn=False, batch_norm=True, 
                 learning_rate=1e-3, name='DeepSpeech2'):

        super(DeepSpeech2, self).__init__()

        self._vocab_size = vocab_size
        self._conv_filters = conv_filters
        self._conv_kernel_sizes = conv_kernel_sizes
        self._conv_strides = conv_strides
        self._recur_sizes = recur_sizes
        self._rnn_type = rnn_type
        self._bidirect_rnn = bidirect_rnn
        self._batch_norm = batch_norm
        self._learning_rate = learning_rate
        self._name = name

        self._conv_batch_norm = None

        with tf.name_scope(self._name):

            self._convolution = [tf.keras.layers.Conv1D(filters=conv_filters[i], 
                kernel_size=conv_kernel_sizes[i], strides=conv_strides[i],
                padding='valid', activation='relu', 
                name='conv1d_{}'.format(i)) for i in range(len(self._conv_filters))]

            if self._batch_norm:
                self._conv_batch_norm = tf.keras.layers.BatchNormalization(name='bn_conv_1d')

            if self._rnn_type == 'gru':
                rnn_init = tf.keras.layers.GRU
            elif self._rnn_type == 'lstm':
                rnn_init = tf.keras.layers.LSTM
            else:
                raise Exception("Invalid rnn_type: '{}' (must be 'lstm' or 'gru')"
                                .format(self._rnn_type))

            self._rnn = []
            for i, r in enumerate(self._recur_sizes):
                layer = rnn_init(r, activation='relu', return_sequences=True,
                    name='{}_{}'.format(self._rnn_type, i))
                if self._bidirect_rnn:
                    layer = tf.keras.layers.Bidirectional(layer)
                self._rnn.append(layer)
                if self._batch_norm:
                    self._rnn.append(tf.keras.layers.BatchNormalization())

            self._fc = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(
                self._vocab_size, name='fc', activation='linear'))

            self._optimizer = tf.keras.optimizers.Adam(lr=self._learning_rate)

    def __call__(self, specs):

        with tf.name_scope(self._name):

            feats = specs
            for layer in self._convolution:
                feats = layer(feats)

            if self._conv_batch_norm:
                feats = self._conv_batch_norm(feats)

            rnn_outputs = feats
            for layer in self._rnn:
                rnn_outputs = layer(rnn_outputs)

            outputs = self._fc(rnn_outputs)

            return tf.transpose(outputs, (1, 0, 2))

    @tf.function
    def train_step(self, specs, spec_lengths, labels, label_lengths):

        with tf.GradientTape() as tape:

            logits = self.__call__(specs)

            loss = tf.nn.ctc_loss(labels=labels, logits=logits,
                label_length=label_lengths, logit_length=spec_lengths)
            cost = tf.reduce_sum(loss)

            decoded, neg_sum_logits = tf.nn.ctc_greedy_decoder(logits, label_lengths)

            gradients = tape.gradient(cost, self.trainable_variables)
            self._optimizer.apply_gradients(zip(gradients, self.trainable_variables))

        return (decoded[0].indices, decoded[0].values, decoded[0].dense_shape), cost

I am currently getting the following error:

ValueError: No gradients provided for any variable: ['DeepSpeech2/conv1d_0/kernel:0', 'DeepSpeech2/conv1d_0/bias:0', 'DeepSpeech2/bn_conv_1d/gamma:0', 'DeepSpeech2/bn_conv_1d/beta:0', 'DeepSpeech2/gru_0/kernel:0', 'DeepSpeech2/gru_0/recurrent_kernel:0', 'DeepSpeech2/gru_0/bias:0', 'DeepSpeech2/batch_normalization_v2/gamma:0', 'DeepSpeech2/batch_normalization_v2/beta:0', 'DeepSpeech2/time_distributed/kernel:0', 'DeepSpeech2/time_distributed/bias:0'].
The error occurs on the line where the gradients are applied to the optimizer. When I print out my gradients variable, it is just a list of None values.

From what I understand, this error indicates that there is no path in the graph from the variables to the loss, but I am not sure why that would be the case. Any help would be greatly appreciated.
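For comparison, here is a minimal, self-contained sketch (a single Dense layer as a stand-in, not the DeepSpeech2 class above, with made-up sizes) in which gradients of a CTC cost do come back non-None, because both the forward pass and the loss are recorded on the tape:

import tensorflow as tf

batch_size, max_time, num_feats, vocab_size = 2, 20, 8, 5
max_label_len = 6

dense = tf.keras.layers.Dense(vocab_size)   # toy stand-in for the full model
specs = tf.random.normal((batch_size, max_time, num_feats))
labels = tf.random.uniform((batch_size, max_label_len), minval=1,
                           maxval=vocab_size, dtype=tf.int32)

with tf.GradientTape() as tape:
    logits = dense(specs)                     # (batch, time, vocab), recorded on the tape
    logits = tf.transpose(logits, (1, 0, 2))  # time-major, as in the model above
    loss = tf.nn.ctc_loss(labels=labels, logits=logits,
                          label_length=tf.fill([batch_size], max_label_len),
                          logit_length=tf.fill([batch_size], max_time),
                          logits_time_major=True, blank_index=0)
    cost = tf.reduce_sum(loss)

grads = tape.gradient(cost, dense.trainable_variables)
print([g is not None for g in grads])         # [True, True] -> gradients exist

The question is why the same structure with the DeepSpeech2 model produces only None gradients.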

Comments:

Are you sure your self.trainable_variables are being watched? tape.watched_variables() should return those variables.

@borarak I printed it out and it was empty, but then I ran tape.watch(self.trainable_variables) and checked tape.watched_variables(); all the variables are there, but my gradients are still None.
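For completeness, here is a minimal, self-contained version of the check discussed in the comments, using a hypothetical toy model and a plain mean-squared-error loss rather than CTC:

import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(3, input_shape=(5,))])
x = tf.random.normal((2, 5))
y = tf.random.normal((2, 3))

with tf.GradientTape() as tape:
    preds = model(x)                                   # forward pass inside the tape
    loss = tf.reduce_mean(tf.square(preds - y))

# Trainable variables are watched automatically once they are used under the tape;
# tape.watch() is only needed for tensors that are not trainable variables.
print([v.name for v in model.trainable_variables])
print([v.name for v in tape.watched_variables()])

grads = tape.gradient(loss, model.trainable_variables)
print(sum(g is not None for g in grads))               # 2 (kernel and bias)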