TensorFlow decoded value is always empty

tensorflow speech-recognition

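When I use an LSTM network for speech recognition, the decoded value is always empty. I trained on 10,000 samples for 4,000 epochs, but the prediction is always blank. Please help.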

class Wav2LetterLSTMModel(SpeechModel):

    def __init__(self, input_loader: BaseInputLoader, input_size: int, num_classes: int):
        super().__init__(input_loader, input_size, num_classes)

    def _create_network(self, num_classes):
        inputs = self.inputs
        inputs, sequence_lengths, labels = self.input_loader.get_inputs()

        XT = tf.transpose(inputs, [1, 0, 2])  # permute time_step_size and batch_size
        XR = tf.reshape(XT, [-1, self.input_size])  # each row has input for each lstm cell (lstm_size=input_vec_size)
        X_split = tf.split(XR, cellsize, 0)  # split them to time_step_size (arrays)

        lstm = rnn.BasicLSTMCell(cellsize, forget_bias=0.5, state_is_tuple=True)
        outputs, _states = rnn.static_rnn(lstm, X_split, dtype=tf.float32)
        return tf.transpose(outputs, (1, 0, 2))

model = Wav2LetterLSTMModel(input_loader=speech_input,
                            input_size=input_size,
                            num_classes=speecht.vocabulary.SIZE + 1)

self.decoded, self.log_probabilities = tf.nn.ctc_greedy_decoder(self.logits,
                                                                self.sequence_lengths // 2,
                                                                merge_repeated=True)

decoded_ids_paths = [Evaluation.extract_decoded_ids(path) for path in decoded]
for label_ids in Evaluation.extract_decoded_ids(label):
    expected_str = speecht.vocabulary.ids_to_sentence(label_ids)
    if verbose:
        print('expected: {}'.format(expected_str))
    for decoded_path in decoded_ids_paths:
        decoded_ids = next(decoded_path)
        decoded_str = speecht.vocabulary.ids_to_sentence(decoded_ids)  # <- this line always returns blank

It would help a lot if you explained which debugging steps you have already taken and exactly where you got stuck. I suspect that if you log a set of summaries and look at them in TensorBoard, they will tell you a lot.
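One thing worth checking when greedy CTC decoding returns nothing: the greedy decoder takes the highest-scoring class at each frame, merges consecutive repeats, and then drops the blank symbol (TensorFlow's CTC ops reserve the last class index for it). If the network has collapsed to predicting blank on every frame, the decoded sequence is empty by construction. The following is a minimal pure-Python sketch of that behaviour, not the TensorFlow or speecht implementation; the function names `ctc_greedy_decode` and `blank_fraction` and the toy scores are made up for illustration.

```python
from itertools import groupby


def best_class_per_frame(frame_scores):
    """Return the index of the highest score for every frame."""
    return [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]


def ctc_greedy_decode(frame_scores, blank_id):
    """Greedy CTC decoding for a single utterance.

    frame_scores: list of per-frame score lists, shape [time][num_classes].
    Returns the label ids left after merging repeats and dropping blanks.
    """
    best_path = best_class_per_frame(frame_scores)
    # Collapse consecutive repeats, then remove the blank symbol.
    collapsed = [label for label, _ in groupby(best_path)]
    return [label for label in collapsed if label != blank_id]


def blank_fraction(frame_scores, blank_id):
    """Fraction of frames whose highest-scoring class is the blank."""
    best_path = best_class_per_frame(frame_scores)
    return sum(label == blank_id for label in best_path) / len(best_path)


# Toy data (hypothetical): 3 real classes with ids 0..2, blank id 3.
BLANK = 3
healthy = [
    [0.1, 0.7, 0.1, 0.1],  # class 1
    [0.1, 0.7, 0.1, 0.1],  # class 1 again, merged with the previous frame
    [0.1, 0.1, 0.1, 0.7],  # blank
    [0.6, 0.2, 0.1, 0.1],  # class 0
]
collapsed_to_blank = [[0.1, 0.1, 0.1, 0.7]] * 4  # blank wins every frame

print(ctc_greedy_decode(healthy, BLANK))             # [1, 0]
print(ctc_greedy_decode(collapsed_to_blank, BLANK))  # []
print(blank_fraction(collapsed_to_blank, BLANK))     # 1.0
```

Computing something like `blank_fraction` on the argmax of the real logits, or logging it as a summary during training, would show whether the empty output comes from the model predicting only blanks or from a problem later in the evaluation code.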