Python Tensorflow: RNN cells have different weights

python python-3.x tensorflow neural-network

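Following a tutorial, I am trying to write a simple RNN in tensorflow. (I am using a simple RNN cell rather than a GRU, and I am not using dropout.)

I am confused, because the different RNN cells in my sequence appear to be assigned separate weights. If I run the following code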

import tensorflow as tf

seq_length = 3
n_h = 100   # Number of hidden units
n_x = 26    # Size of input layer
n_y = 26    # Size of output layer

inputs = tf.placeholder(tf.float32, [None, seq_length, n_x])

cells = []
for _ in range(seq_length):
    cell = tf.contrib.rnn.BasicRNNCell(n_h)
    cells.append(cell)
multi_rnn_cell = tf.contrib.rnn.MultiRNNCell(cells)

initial_state = tf.placeholder(tf.float32, [None, n_h])

outputs_h, output_final_state = tf.nn.dynamic_rnn(multi_rnn_cell, inputs, dtype=tf.float32)

sess = tf.Session()
sess.run(tf.global_variables_initializer())

print('Trainable variables:')
for v in tf.trainable_variables():
    print(v)

Running this in Python 3 gives the following output:

Trainable variables:
<tf.Variable 'rnn/multi_rnn_cell/cell_0/basic_rnn_cell/kernel:0' shape=(126, 100) dtype=float32_ref>
<tf.Variable 'rnn/multi_rnn_cell/cell_0/basic_rnn_cell/bias:0' shape=(100,) dtype=float32_ref>
<tf.Variable 'rnn/multi_rnn_cell/cell_1/basic_rnn_cell/kernel:0' shape=(200, 100) dtype=float32_ref>
<tf.Variable 'rnn/multi_rnn_cell/cell_1/basic_rnn_cell/bias:0' shape=(100,) dtype=float32_ref>
<tf.Variable 'rnn/multi_rnn_cell/cell_2/basic_rnn_cell/kernel:0' shape=(200, 100) dtype=float32_ref>
<tf.Variable 'rnn/multi_rnn_cell/cell_2/basic_rnn_cell/bias:0' shape=(100,) dtype=float32_ref>

Firstly, this is not what I want: an RNN needs to have the same weights from input to hidden, and from hidden to hidden, at every layer.

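Secondly, I don't really understand why I get all these separate variables. Looking at the source, it seems that BasicRNNCell should call `_linear`, which should check whether there is a variable named `_WEIGHTS_VARIABLE_NAME` (set globally to "kernel") and use it if so. I don't understand how "kernel" gets decorated into "rnn/multi_rnn_cell/cell_0/basic_rnn_cell/kernel:0".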

If anyone can explain what I am doing wrong, I would be very grateful.

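Be careful to distinguish two different things: the number of layers in your recurrent neural network, and the number of times this RNN is unrolled by the backpropagation-through-time algorithm to handle the length of your sequence.

In your code:

`MultiRNNCell` is responsible for creating a 3-layer RNN (you are creating three layers there, and `MultiRNNCell` is just a wrapper that makes them easier to handle).

`tf.nn.dynamic_rnn` is responsible for unrolling this three-layer network a number of times related to the length of your sequence.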
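To make the layers-versus-time-steps distinction concrete, here is a small pure-Python sketch that mimics the bookkeeping only; it does not call TensorFlow. It assumes each basic RNN cell has one kernel acting on the concatenation of its input and its previous hidden state, which is consistent with the shapes printed in the question. The helper names `stacked_rnn_variable_shapes` and `cell_applications` are made up for illustration.

```python
def stacked_rnn_variable_shapes(n_x, n_h, num_layers):
    """Return {name: shape} for a stack of basic RNN cells.

    Assumption: each cell's kernel multiplies [input, previous_state],
    so it has (input_size + n_h) rows and n_h columns.
    """
    shapes = {}
    input_size = n_x  # the first layer sees the raw input
    for layer in range(num_layers):
        shapes["cell_%d/kernel" % layer] = (input_size + n_h, n_h)
        shapes["cell_%d/bias" % layer] = (n_h,)
        input_size = n_h  # deeper layers see the hidden output of the layer below
    return shapes


def cell_applications(seq_length, num_layers):
    """Count how many times each layer's single set of weights is applied when unrolled."""
    counts = {layer: 0 for layer in range(num_layers)}
    for _ in range(seq_length):          # unrolling over time
        for layer in range(num_layers):  # stacking over depth
            counts[layer] += 1           # same weights reused at every time step
    return counts


shapes = stacked_rnn_variable_shapes(n_x=26, n_h=100, num_layers=3)
for name, shape in shapes.items():
    print(name, shape)

# The set of variables depends on the number of layers, not on the sequence length.
print(cell_applications(seq_length=3, num_layers=3))
print(cell_applications(seq_length=50, num_layers=3))
```

With `num_layers=3` the shapes match the six variables listed in the question. With `num_layers=1` only a single kernel and bias remain, however long the sequence is, which is the weight sharing across time steps that the question was looking for. In the original code that would presumably mean passing one cell to `tf.nn.dynamic_rnn` instead of wrapping three cells in `MultiRNNCell`.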

I get it now! Thanks, now that you have pointed it out, it is obvious.