TensorFlow: Is the test error of LSTMBlockFusedCell really 6% higher than LSTMCell, or did I make a mistake with dropout?

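I built a simple stacked dynamic bidirectional LSTM for a regression problem using LSTMCell, DropoutWrapper, MultiRNNCell, and bidirectional_dynamic_rnn (Model_Orig). After 20 epochs the test absolute error is 2.89, and training took 14.5 hours.

I then tried an alternative implementation (Model_blockfused) with the same structure but built from block-fused components, namely tf.layers.dropout, LSTMBlockFusedCell, and TimeReversedFusedRNN. Training is much faster for Model_blockfused (3.6 hours), but after 20 epochs the test absolute error is 3.06, about 6% higher.

So, should I expect a 6% performance difference between LSTMBlockFusedCell and LSTMCell? Or did I make a mistake when building Model_blockfused, especially in how I apply dropout?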

Here is the simplified code for Model_Orig:

Here is the simplified code for Model_blockfused:

Thanks.

First, you should use two separate tf.contrib.rnn.LSTMBlockFusedCell instances for the forward (fw) and backward (bw) directions. Change this code:

cur_fw_BFcell_obj = tf.contrib.rnn.LSTMBlockFusedCell(num_units=LSTM_CELL_SIZE)
cur_bw_BFcell_obj = tf.contrib.rnn.TimeReversedFusedRNN(cur_fw_BFcell_obj)

to this:

cur_fw_BFcell_obj = tf.contrib.rnn.LSTMBlockFusedCell(num_units=LSTM_CELL_SIZE)
cur_bw_BFcell_obj_cell = tf.contrib.rnn.LSTMBlockFusedCell(num_units=LSTM_CELL_SIZE)
cur_bw_BFcell_obj = tf.contrib.rnn.TimeReversedFusedRNN(cur_bw_BFcell_obj_cell)
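
Why the separate instance matters can be illustrated with a toy model in plain Python. This is only a sketch, not TensorFlow code: `ToyFusedCell` and `ToyTimeReversed` are hypothetical stand-ins, and it assumes the time-reversing wrapper simply reuses whatever cell object it is given rather than creating parameters of its own.

```python
import math
import random


class ToyFusedCell:
    """Hypothetical stand-in for a fused cell: owns one weight and
    consumes a whole sequence in a single call."""

    def __init__(self, seed):
        self.w = random.Random(seed).uniform(-1.0, 1.0)

    def __call__(self, xs):
        h, out = 0.0, []
        for x in xs:
            h = math.tanh(self.w * h + x)
            out.append(h)
        return out


class ToyTimeReversed:
    """Hypothetical wrapper: runs the wrapped cell on the reversed
    sequence and restores the original order. It owns no weights."""

    def __init__(self, cell):
        self.cell = cell

    def __call__(self, xs):
        return self.cell(xs[::-1])[::-1]


# Pattern from the question: the backward wrapper reuses the forward cell,
# so both directions are driven by the same weight.
fw_shared = ToyFusedCell(seed=0)
bw_shared = ToyTimeReversed(fw_shared)

# Pattern from the answer: the backward direction gets its own cell.
fw = ToyFusedCell(seed=0)
bw = ToyTimeReversed(ToyFusedCell(seed=1))

shared_weights = bw_shared.cell is fw_shared
independent_weights = bw.cell is not fw and bw.cell.w != fw.w
print(shared_weights, independent_weights)
```

With a shared cell, the backward pass can only learn the same transformation as the forward pass, which reduces the capacity of the bidirectional layer compared with Model_Orig.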

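Second, the TensorFlow API documentation says:

The combined forward and backward layer outputs are used as input of the next layer.

So this code: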

fw_out_TM, fw_state = cur_fw_BFcell_obj(fw_out_TM, dtype=tf.float32, sequence_length=length)
bw_out_TM, bw_state = cur_bw_BFcell_obj(bw_out_TM, dtype=tf.float32, sequence_length=length)

should be changed to:

next_layer_input = tf.concat([fw_out_TM, bw_out_TM], axis=2)
fw_out_TM, fw_state = cur_fw_BFcell_obj(next_layer_input, dtype=tf.float32, sequence_length=length)
bw_out_TM, bw_state = cur_bw_BFcell_obj(next_layer_input, dtype=tf.float32, sequence_length=length)
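
The stacking pattern described above can be sketched as a loop over layers. This is a plain-Python illustration under stated assumptions, not the TensorFlow implementation: `toy_cell`, `time_reversed`, and `stacked_bidirectional` are hypothetical helpers, a single sequence is stored as a list of feature lists (no batch axis), and the first layer receives the raw inputs while every later layer receives the concatenated forward and backward outputs of the layer below.

```python
import math


def toy_cell(num_units, scale):
    """Return a hypothetical whole-sequence cell mapping [T][D] -> [T][num_units]."""
    def run(seq):
        h = [0.0] * num_units
        out = []
        for x in seq:
            s = sum(x)
            h = [math.tanh(scale * (j + 1) * s + 0.5 * h[j])
                 for j in range(num_units)]
            out.append(h)
        return out
    return run


def time_reversed(cell):
    """Run a cell over the reversed sequence, then restore the original order."""
    return lambda seq: cell(seq[::-1])[::-1]


def stacked_bidirectional(inputs, num_layers, num_units):
    layer_input = inputs
    for layer in range(num_layers):
        # Each direction of each layer gets its own cell.
        fw_cell = toy_cell(num_units, scale=0.1 * (layer + 1))
        bw_cell = time_reversed(toy_cell(num_units, scale=-0.1 * (layer + 1)))
        # Both directions read the same input for this layer.
        fw_out = fw_cell(layer_input)
        bw_out = bw_cell(layer_input)
        # Concatenate along the feature axis (axis=2 for a time-major
        # [time, batch, features] tensor) to form the next layer's input.
        layer_input = [f + b for f, b in zip(fw_out, bw_out)]
    return layer_input


inputs = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5], [0.0, -1.0, 2.0], [1.0, 1.0, 1.0]]
outputs = stacked_bidirectional(inputs, num_layers=2, num_units=4)
print(len(outputs), len(outputs[0]))  # 4 timesteps, 8 features per timestep
```

The point of the sketch is the data flow: the backward cell of layer 2 sees the forward outputs of layer 1 as well as the backward ones, which is not the case when each direction is chained only to its own previous output.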
