TensorFlow decoder does not accept the output of a bidirectional encoder
I am trying to implement an encoder-decoder model in TensorFlow. The encoder is a bidirectional cell:
import tensorflow as tf

def encoder(hidden_units, encoder_embedding, sequence_length):
    forward_cell = tf.contrib.rnn.LSTMCell(hidden_units)
    backward_cell = tf.contrib.rnn.LSTMCell(hidden_units)
    bi_outputs, final_states = tf.nn.bidirectional_dynamic_rnn(
        forward_cell, backward_cell, encoder_embedding,
        sequence_length=sequence_length, dtype=tf.float32)
    # Concatenate forward and backward outputs along the feature axis,
    # so encoder_outputs has depth 2 * hidden_units.
    encoder_outputs = tf.concat(bi_outputs, 2)
    forward_cell_state, backward_cell_state = final_states
    # Concatenate the forward and backward LSTM states as well.
    cell_state_final = tf.concat([forward_cell_state.c, backward_cell_state.c], 1)
    hidden_state_final = tf.concat([forward_cell_state.h, backward_cell_state.h], 1)
    encoder_final_state = tf.nn.rnn_cell.LSTMStateTuple(c=cell_state_final, h=hidden_state_final)
    return encoder_outputs, encoder_final_state
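A quick shape check, with toy sizes chosen to match the error below (batch 12, hidden_units 21; the embedding depth 8 and sequence length 10 are made up), shows that the concatenation doubles the feature depth:

embedded = tf.placeholder(tf.float32, [12, 10, 8])  # (batch, time, embedding)
seq_len = tf.fill([12], 10)                         # all sequences at full length

enc_out, enc_state = encoder(21, embedded, seq_len)
print(enc_out.shape)      # (12, 10, 42) -- 2 * hidden_units
print(enc_state.c.shape)  # (12, 42)
print(enc_state.h.shape)  # (12, 42)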
Something goes wrong between the encoder and the decoder. I get an error like: ValueError: Shapes (?, 42) and (12, 21) are incompatible
The decoder uses an attention mechanism and looks like this:
from tensorflow.python.layers.core import Dense

def decoder(decoder_embedding, vocab_size, hidden_units, sequence_length, encoder_output, encoder_state, batchsize):
    projection_layer = Dense(vocab_size)
    helper = tf.contrib.seq2seq.TrainingHelper(decoder_embedding, sequence_length=sequence_length)
    # Decoder cell
    decoder_cell = tf.contrib.rnn.LSTMCell(hidden_units)
    # Attention mechanism
    attention_mechanism = tf.contrib.seq2seq.LuongAttention(hidden_units, encoder_output)
    attn_cell = tf.contrib.seq2seq.AttentionWrapper(
        decoder_cell, attention_mechanism, attention_layer_size=hidden_units)
    # Initial attention state, seeded with the encoder's final state
    attn_zero = attn_cell.zero_state(batch_size=batchsize, dtype=tf.float32)
    ini_state = attn_zero.clone(cell_state=encoder_state)
    decoder = tf.contrib.seq2seq.BasicDecoder(
        cell=attn_cell, initial_state=ini_state, helper=helper, output_layer=projection_layer)
    decoder_outputs, _final_state, _final_sequence_lengths = tf.contrib.seq2seq.dynamic_decode(decoder)
    return decoder_outputs
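For reference, this is roughly how the two parts are wired together (the embedding depth and vocab size below are made up; hidden_units = 21 and batchsize = 12 match the shapes in the error):

hidden_units, batchsize = 21, 12

src_embedded = tf.placeholder(tf.float32, [batchsize, None, 8])
tgt_embedded = tf.placeholder(tf.float32, [batchsize, None, 8])
src_len = tf.placeholder(tf.int32, [batchsize])
tgt_len = tf.placeholder(tf.int32, [batchsize])

enc_out, enc_state = encoder(hidden_units, src_embedded, src_len)
# Fails while building the decoder: enc_state carries 2 * hidden_units = 42
# features, but the decoder cell is built with hidden_units = 21.
dec_out = decoder(tgt_embedded, vocab_size=1000, hidden_units=hidden_units,
                  sequence_length=tgt_len, encoder_output=enc_out,
                  encoder_state=enc_state, batchsize=batchsize)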
How can I fix this?

The problem is that the encoder states have twice as many hidden units as the decoder you are trying to attend with. Luong attention computes the attention energies (state similarities) as a dot product between the decoder state and each encoder state, and a dot product requires both vectors to have the same dimension. Here the concatenated encoder states have 2 × 21 = 42 features while the decoder state has 21, which are exactly the 42 and 21 in the error message (12 is the batch size). You have a few options, sketched in code after this list:

- give the decoder twice as many hidden units as the encoder (decoder hidden units = 2 × encoder hidden units), so it matches the concatenated forward and backward states;
- project the concatenated encoder outputs and final state back down to the decoder's hidden size with a dense layer; or
- switch to tf.contrib.seq2seq.BahdanauAttention, which projects both the encoder and decoder states into a shared attention dimension before scoring them.
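A minimal sketch of the first two options, reusing the names from the code above (tf.layers.dense is my choice of projection here, not something from the original post):

# Option 1 (sketch): build the decoder with twice the encoder's size, so the
# attention query and the cloned initial state both carry 2 * hidden_units
# features.
decoder_cell = tf.contrib.rnn.LSTMCell(2 * hidden_units)
attention_mechanism = tf.contrib.seq2seq.LuongAttention(2 * hidden_units, encoder_output)
attn_cell = tf.contrib.seq2seq.AttentionWrapper(
    decoder_cell, attention_mechanism, attention_layer_size=2 * hidden_units)

# Option 2 (sketch): inside encoder(), project the concatenated tensors back
# down to hidden_units before returning them.
encoder_outputs = tf.layers.dense(tf.concat(bi_outputs, 2), hidden_units)
cell_state_final = tf.layers.dense(
    tf.concat([forward_cell_state.c, backward_cell_state.c], 1), hidden_units)
hidden_state_final = tf.layers.dense(
    tf.concat([forward_cell_state.h, backward_cell_state.h], 1), hidden_units)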