MultiRNNCell fails on a list containing the same BasicLSTMCell twice


python tensorflow machine-learning


The following code fails when the same base cell is passed twice, as (cell1, cell1), to MultiRNNCell:

import tensorflow as tf
cell1 = tf.contrib.rnn.BasicLSTMCell(128,reuse=False, name = "cell1")
cell2 = tf.contrib.rnn.BasicLSTMCell(128,reuse=False,name = "cell2")
multi = tf.contrib.rnn.MultiRNNCell([cell1, cell1] )
init = multi.zero_state(64, tf.float32)
output,state = multi(tf.ones([64,512]),init)

This code, by contrast, works, because it uses (cell1, cell2), even though cell2 is configured identically to cell1:

import tensorflow as tf
cell1 = tf.contrib.rnn.BasicLSTMCell(128,reuse=False, name = "cell1")
cell2 = tf.contrib.rnn.BasicLSTMCell(128,reuse=False,name = "cell2")
multi = tf.contrib.rnn.MultiRNNCell([cell1, cell2] )
init = multi.zero_state(64, tf.float32)
output,state = multi(tf.ones([64,512]),init)

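What is the difference between these two code examples?

The error from the first one is:

ValueError: Dimensions must be equal, but are 256 and 640 for 'multi_rnn_cell/cell_0/cell1/MatMul_1' (op: 'MatMul') with input shapes: [64,256], [640,512].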

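This is a known limitation that has been discussed before. The problem is that each cell instance creates an internal variable for its weights. The shape of this variable is determined by the hidden size (128 in your case) and by the size of the input that this cell instance receives. When the same cell is used more than once, you must make sure its input has the same size every time.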

Consider your example code:

import tensorflow as tf
cell1 = tf.contrib.rnn.BasicLSTMCell(128,reuse=False, name = "cell1")
cell2 = tf.contrib.rnn.BasicLSTMCell(128,reuse=False,name = "cell2")
multi = tf.contrib.rnn.MultiRNNCell([cell1, cell1] )
init = multi.zero_state(64, tf.float32)
output,state = multi(tf.ones([64,512]),init)

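The inputs to the two cells in multi will be [..., 640] and [..., 256], because 640 = 512 + 128 and 256 = 128 + 128 (each cell receives the output of the layer below, or the input sequence for the first layer, concatenated with its own previous hidden state). Their internal weight matrices will therefore be [640, 512] and [256, 512] (the 512 here is actually 128*4, not the input size).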
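The shape arithmetic above can be sketched in plain Python, without TensorFlow. The helper names `lstm_kernel_shape` and `stacked_kernel_shapes` are made up for this illustration and are not part of any library; the sketch only assumes the layout described above, where a basic LSTM cell keeps one fused weight matrix with `input_size + num_units` rows and `4 * num_units` columns.

```python
def lstm_kernel_shape(input_size, num_units):
    """Shape of the fused weight matrix of a basic LSTM cell (illustrative).

    Rows: the layer input concatenated with the cell's previous hidden state.
    Columns: four gates, each num_units wide.
    """
    return (input_size + num_units, 4 * num_units)


def stacked_kernel_shapes(input_size, num_units, num_layers):
    """Kernel shape that each layer of a stack needs, bottom layer first."""
    shapes = []
    layer_input = input_size
    for _ in range(num_layers):
        shapes.append(lstm_kernel_shape(layer_input, num_units))
        layer_input = num_units  # the next layer consumes this layer's output
    return shapes


shapes = stacked_kernel_shapes(input_size=512, num_units=128, num_layers=2)
print(shapes)  # [(640, 512), (256, 512)]
```

The two layers need kernels of different shapes, so a single variable cannot serve both. If the input width were 128 instead of 512, both layers would need a [256, 512] kernel and sharing one cell would be shape-compatible.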

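But you are using the same cell instance for both layers! TensorFlow tries to match the matrix it has already created against the new input and fails. When you use different instances, on the other hand, TensorFlow can instantiate a separate matrix for each layer and works out the shapes correctly.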
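The failure mode can be imitated with a toy model. `LazyCell` and `run_stack` below are hypothetical stand-ins written for this answer, not TensorFlow classes. They only mimic the behaviour described above: the kernel is created on first use and must match on every later use.

```python
class LazyCell:
    """Toy stand-in for an RNN cell that builds its kernel on first use."""

    def __init__(self, num_units):
        self.num_units = num_units
        self.kernel_shape = None  # created lazily, like the real cell's variable

    def __call__(self, input_size):
        needed = (input_size + self.num_units, 4 * self.num_units)
        if self.kernel_shape is None:
            self.kernel_shape = needed
        elif self.kernel_shape != needed:
            raise ValueError(
                "kernel already built as %s, but this input needs %s"
                % (self.kernel_shape, needed))
        return self.num_units  # output width passed to the next layer


def run_stack(cells, input_size):
    """Pass an input width through each cell in turn; return the final width."""
    width = input_size
    for cell in cells:
        width = cell(width)
    return width


# Two distinct instances: each one builds its own kernel.
print(run_stack([LazyCell(128), LazyCell(128)], 512))  # 128

# One instance used twice: the second call finds a mismatching kernel.
shared = LazyCell(128)
try:
    run_stack([shared, shared], 512)
except ValueError as err:
    print("failed:", err)
```

In the real code the corresponding fix is to construct a new `tf.contrib.rnn.BasicLSTMCell(128)` for every layer, for example with a list comprehension, instead of repeating one object in the list passed to `MultiRNNCell`.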