Tensorflow decoded value is always blank
When I use an LSTM network for speech recognition, the decoded value is always empty. I trained on 10,000 samples for 4,000 epochs, but the prediction is always blank. Please help.
class Wav2LetterLSTMModel(SpeechModel):

    def __init__(self, input_loader: BaseInputLoader, input_size: int, num_classes: int):
        super().__init__(input_loader, input_size, num_classes)

    def _create_network(self, num_classes):
        inputs = self.inputs
        inputs, sequence_lengths, labels = self.input_loader.get_inputs()
        XT = tf.transpose(inputs, [1, 0, 2])  # permute time_step_size and batch_size
        XR = tf.reshape(XT, [-1, self.input_size])  # each row has input for each lstm cell (lstm_size=input_vec_size)
        X_split = tf.split(XR, cellsize, 0)  # split them to time_step_size (arrays)
        lstm = rnn.BasicLSTMCell(cellsize, forget_bias=0.5, state_is_tuple=True)
        outputs, _states = rnn.static_rnn(lstm, X_split, dtype=tf.float32)
        return tf.transpose(outputs, (1, 0, 2))
It would be very helpful if you explained which debugging steps you have taken and where exactly you got stuck. I suspect that if you log a bunch of summaries and look at them in TensorBoard, it will tell you a lot.
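One quick check in that spirit is to measure how often the blank class wins the per-frame argmax. The helper below is only a sketch in plain Python: `blank_fraction` and the sample scores are hypothetical, and it assumes the blank label is the last class index (which is the convention `tf.nn.ctc_greedy_decoder` uses). In practice you would feed it the logits of one utterance pulled out of the session.

```python
def blank_fraction(frame_scores, blank_id):
    """Return the fraction of frames whose highest-scoring class is the CTC blank."""
    if not frame_scores:
        return 0.0
    blank_frames = 0
    for scores in frame_scores:
        # Index of the best class for this frame (first index wins on ties).
        best_id = max(range(len(scores)), key=scores.__getitem__)
        if best_id == blank_id:
            blank_frames += 1
    return blank_frames / len(frame_scores)


# Hypothetical scores: 4 frames, 3 classes (two letters plus blank at index 2).
scores = [
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
    [0.6, 0.1, 0.3],
    [0.2, 0.2, 0.6],
]
fraction = blank_fraction(scores, blank_id=2)
print("blank wins in {:.0%} of frames".format(fraction))
```

If this value sits at or near 1.0 for every utterance, the decoder is behaving as designed and the problem is upstream, in the network or in training.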
model = Wav2LetterLSTMModel(input_loader=speech_input,
                            input_size=input_size,
                            num_classes=speecht.vocabulary.SIZE + 1)
self.decoded, self.log_probabilities = tf.nn.ctc_greedy_decoder(self.logits,
                                                                self.sequence_lengths // 2,
                                                                merge_repeated=True)
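For reference, here is a rough pure-Python sketch of what greedy CTC decoding does for a single utterance: take the best class per frame, optionally merge consecutive repeats, and drop blanks. `greedy_ctc_decode` is a hypothetical name, not the TensorFlow implementation, and the blank is assumed to be the last class index. It shows why an empty result means the blank won every frame.

```python
def greedy_ctc_decode(frame_scores, blank_id, merge_repeated=True):
    """Greedy CTC decode of one utterance given per-frame class scores."""
    decoded = []
    previous_id = None
    for scores in frame_scores:
        # Best class for this frame (first index wins on ties).
        best_id = max(range(len(scores)), key=scores.__getitem__)
        is_repeat = merge_repeated and best_id == previous_id
        if best_id != blank_id and not is_repeat:
            decoded.append(best_id)
        previous_id = best_id
    return decoded


BLANK = 2  # assumed: last of 3 classes

# Blank wins every frame, so nothing is emitted.
all_blank = [[0.1, 0.1, 0.8]] * 3
empty_result = greedy_ctc_decode(all_blank, BLANK)

# Per-frame argmax is 0, 0, blank, 0, 1: the repeat merges, the blank separates.
mixed = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
]
mixed_result = greedy_ctc_decode(mixed, BLANK)
print(empty_result, mixed_result)
```

Note that the network above returns the raw LSTM outputs, so the last dimension of the logits is `cellsize`. It may be worth checking that this matches `num_classes`, since the decoder treats the last index of that dimension as the blank.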
decoded_ids_paths = [Evaluation.extract_decoded_ids(path) for path in decoded]
for label_ids in Evaluation.extract_decoded_ids(label):
    expected_str = speecht.vocabulary.ids_to_sentence(label_ids)
    if verbose:
        print('expected: {}'.format(expected_str))
    for decoded_path in decoded_ids_paths:
        decoded_ids = next(decoded_path)
        decoded_str = speecht.vocabulary.ids_to_sentence(decoded_ids)  # <- this line always returns blank