Character RNN in TensorFlow (Python)

python tensorflow deep-learning

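I'm trying to get a simple RNN working in TensorFlow, but I've run into a few problems.

For now, all I want to do is run the forward pass of an RNN that uses an LSTM as its cell type.

I scraped some news articles and want to feed them into the RNN. I split the string containing the concatenation of all articles into characters and mapped the characters to integers. I then one-hot encode the integers.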

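import numpy as np

# `article` is the string holding the concatenated text of all articles
data = [c for c in article]
chars = list(set(data))
idx_chars = {i:ch for i,ch in enumerate(chars)}
chars_idx = {ch:i for i,ch in enumerate(chars)}
int_data = [chars_idx[ch] for ch in data]

# config values
vocab_size = len(chars)
hidden_size = 100
seq_length = 25

# helper functions to get the one-hot encoding

def onehot(value):
    # vector of length vocab_size with a single 1 at position `value`
    result = np.zeros(vocab_size)
    result[value] = 1
    return result

def vectorize_input(inputs):
    # one one-hot vector per character index
    result = [onehot(x) for x in inputs]
    return result

# one-hot encode the first seq_length characters
input = vectorize_input(int_data[:seq_length])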
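As a side note, the same encoding can be produced in one step by indexing into an identity matrix. The following is only a sketch: the helper name `onehot_batch` and the toy text `"hello"` are made up for illustration, and the character list is sorted so that the mapping is the same on every run (the order of `list(set(data))` can vary between runs).

```python
import numpy as np

def onehot_batch(indices, vocab_size):
    """Return a (len(indices), vocab_size) float32 one-hot matrix."""
    indices = np.asarray(indices, dtype=np.int64)
    # Row i of the identity matrix is the one-hot vector for index i.
    return np.eye(vocab_size, dtype=np.float32)[indices]

text = "hello"                       # toy stand-in for the article text
chars = sorted(set(text))            # sorted for a reproducible mapping
chars_idx = {ch: i for i, ch in enumerate(chars)}
int_data = [chars_idx[ch] for ch in text]

encoded = onehot_batch(int_data, len(chars))
print(encoded.shape)
```

Each row of `encoded` contains exactly one 1, at the column given by the corresponding entry of `int_data`.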

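Now for the TensorFlow code. I want to go through all the characters in my data, using 25 characters per forward pass. My first question is about the batch size: if I do it the way I just described, my batch size is 1, right? So each vector corresponding to one character of the input has shape [1, vocab_size], and my input contains 25 of these vectors. So I used the following tensors: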

seq_input = tf.placeholder(tf.int32, shape = [seq_length, 1, vocab_size])
targets = tf.placeholder(tf.int32, shape = [seq_length, 1, vocab_size])
inputs = [tf.reshape(i,(1,vocab_size)) for i in tf.split(0,seq_length,seq_input)]

I had to create the last tensor because that is the format the rnn function expects.
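To make the shapes concrete, here is a NumPy-only sketch of the same bookkeeping: one array of shape [seq_length, batch_size, vocab_size] is cut into a list of seq_length arrays, each of shape [batch_size, vocab_size]. The value `vocab_size = 8` is an arbitrary assumption for the example, and `np.split` only stands in for the `tf.split` call in the question.

```python
import numpy as np

seq_length, batch_size, vocab_size = 25, 1, 8  # vocab_size chosen arbitrarily

# One training chunk: 25 time steps, batch of 1, one-hot width vocab_size.
seq_input = np.zeros((seq_length, batch_size, vocab_size), dtype=np.float32)

# Cut along the time axis, then drop the leading axis of length 1.
steps = [
    step.reshape(batch_size, vocab_size)
    for step in np.split(seq_input, seq_length, axis=0)
]

print(len(steps), steps[0].shape)
```

With a batch size of 1, every element of `steps` is a single row, which matches the [1, vocab_size] shape described in the question.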

Then I ran into a problem with variable scopes. This code gives me the following error:

cell = rnn_cell.BasicLSTMCell(hidden_size, input_size = vocab_size)
# note: first argument of zero_state is the batch_size
initial_state = cell.zero_state(1, tf.float32)
outputs, state = rnn.rnn(cell, inputs, initial_state= initial_state)
sess = tf.Session()
sess.run([outputs, state], feed_dict = {inputs:input})

ValueError                                Traceback (most recent call last)
<ipython-input-90-449af38c387d> in <module>()
      7     # note: first argument of zero_state is supposed to be batch_size
      8     initial_state = cell.zero_state(1, tf.float32)
----> 9     outputs, state = rnn.rnn(cell, inputs, initial_state= initial_state)
     10 
     11 sess = tf.Session()

/Library/Python/2.7/site-packages/tensorflow/python/ops/rnn.pyc in rnn(cell, inputs, initial_state, dtype, sequence_length, scope)
    124             zero_output, state, call_cell)
    125       else:
--> 126         (output, state) = call_cell()
    127 
    128       outputs.append(output)

/Library/Python/2.7/site-packages/tensorflow/python/ops/rnn.pyc in <lambda>()
    117       if time > 0: vs.get_variable_scope().reuse_variables()
    118       # pylint: disable=cell-var-from-loop
--> 119       call_cell = lambda: cell(input_, state)
    120       # pylint: enable=cell-var-from-loop
    121       if sequence_length:

/Library/Python/2.7/site-packages/tensorflow/python/ops/rnn_cell.pyc in __call__(self, inputs, state, scope)
    200       # Parameters of gates are concatenated into one multiply for efficiency.
    201       c, h = array_ops.split(1, 2, state)
--> 202       concat = linear([inputs, h], 4 * self._num_units, True)
    203 
    204       # i = input_gate, j = new_input, f = forget_gate, o = output_gate

/Library/Python/2.7/site-packages/tensorflow/python/ops/rnn_cell.pyc in linear(args, output_size, bias, bias_start, scope)
    700   # Now the computation.
    701   with vs.variable_scope(scope or "Linear"):
--> 702     matrix = vs.get_variable("Matrix", [total_arg_size, output_size])
    703     if len(args) == 1:
    704       res = math_ops.matmul(args[0], matrix)

/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(name, shape, dtype, initializer, trainable, collections)
    254   return get_variable_scope().get_variable(_get_default_variable_store(), name,
    255                                            shape, dtype, initializer,
--> 256                                            trainable, collections)
    257 
    258 

/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, var_store, name, shape, dtype, initializer, trainable, collections)
    186     with ops.name_scope(None):
    187       return var_store.get_variable(full_name, shape, dtype, initializer,
--> 188                                     self.reuse, trainable, collections)
    189 
    190 

/Library/Python/2.7/site-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, name, shape, dtype, initializer, reuse, trainable, collections)
     99       if should_check and not reuse:
    100         raise ValueError("Over-sharing: Variable %s already exists, disallowed."
--> 101                          " Did you mean to set reuse=True in VarScope?" % name)
    102       found_var = self._vars[name]
    103       if not shape.is_compatible_with(found_var.get_shape()):

ValueError: Over-sharing: Variable forward/RNN/BasicLSTMCell/Linear/Matrix already exists, disallowed. Did you mean to set reuse=True in VarScope?

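I don't know why I get this error, because I don't actually define any variables in my code; the variables are only created inside the rnn and rnn_cell functions. Can anyone tell me how to fix it?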

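The other error I'm currently getting is a type error: my inputs are of type tf.int32, but the hidden state created in the LSTM is of type tf.float32, and the linear function in rnn_cell.py concatenates these two tensors and multiplies them by a weight matrix. Why is this not possible? I assumed that, since the inputs are one-hot encoded, int32 would be the more common type for them.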
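The traceback shows the step in question: `linear([inputs, h], 4 * self._num_units, True)` joins the input and the hidden state and multiplies the result by one weight matrix. TensorFlow does not cast between int32 and float32 implicitly, so a likely way out (an assumption, not something confirmed in this thread) is to build the one-hot vectors as float32 and declare the placeholder with tf.float32. The NumPy sketch below mimics only that single step; the names `x`, `h`, `W` and the sizes are illustrative.

```python
import numpy as np

vocab_size, hidden_size = 8, 100      # vocab_size chosen arbitrarily
rng = np.random.default_rng(0)

# One-hot input for character index 3, stored as float32 from the start.
x = np.zeros((1, vocab_size), dtype=np.float32)
x[0, 3] = 1.0

# Zero initial hidden state, as cell.zero_state(1, tf.float32) would give.
h = np.zeros((1, hidden_size), dtype=np.float32)

# A single weight matrix covering all four gates.
W = rng.standard_normal((vocab_size + hidden_size, 4 * hidden_size)).astype(np.float32)

# Concatenate along the feature axis, then do one matrix multiply.
gates = np.concatenate([x, h], axis=1) @ W
print(gates.shape, gates.dtype)
```

Because `x` and `h` share a dtype, the concatenation needs no cast. Multiplying a one-hot row by `W` simply selects one row of `W`, which is why the values are naturally floats even though the encoding consists of 0s and 1s.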

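More generally, is a batch size of 1 the standard approach when training a character RNN? I have only looked at Andrej Karpathy's code, in which he trains a character RNN in plain numpy, and he uses the same procedure: he simply steps through the whole text in sequences of length 25. Here is the code: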
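Independently of Karpathy's implementation, the chunking scheme described here can be sketched in a few lines: walk through the integer-encoded text in steps of seq_length, and use the same window shifted by one character as the targets. The generator name `iter_chunks` and the toy data are assumptions made for this example.

```python
def iter_chunks(int_data, seq_length):
    """Yield (inputs, targets) pairs of length seq_length.

    targets is inputs shifted one position to the right, so the model
    is asked to predict the next character at every time step.
    """
    # Stop early enough that the shifted target window stays in bounds.
    for start in range(0, len(int_data) - seq_length, seq_length):
        inputs = int_data[start:start + seq_length]
        targets = int_data[start + 1:start + seq_length + 1]
        yield inputs, targets

int_data = list(range(10))            # toy stand-in for the encoded text
chunks = list(iter_chunks(int_data, seq_length=3))
for inputs, targets in chunks:
    print(inputs, targets)
```

Each chunk would be fed as one forward pass; with a single sequence per pass, the batch size is 1.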

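I'm almost certain this error occurs because a previous (possibly partial) execution of the same commands in your IPython session left some variables behind. The simplest fix is to run tf.reset_default_graph() before running this cell. (After calling this function you will have to recreate the seq_input, targets, and inputs tensors, because all existing tensors will have become invalid.)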
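The effect can be reproduced in isolation. The sketch below assumes a TensorFlow installation that provides the `tf.compat.v1` module, and it uses a hypothetical `build_model` function with a single `get_variable` call as a stand-in for the `rnn.rnn(...)` call, since the old `rnn` and `rnn_cell` modules from the question are not assumed to be available.

```python
import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()        # graph mode, as in the question

def build_model():
    # Hypothetical stand-in for rnn.rnn(...): creates one named variable.
    with tf1.variable_scope("forward"):
        return tf1.get_variable("Matrix", shape=[4, 8], dtype=tf.float32)

tf1.reset_default_graph()
build_model()                        # first run of the cell: works

try:
    build_model()                    # second run: the name is already taken
    second_run_failed = False
except ValueError:
    second_run_failed = True

tf1.reset_default_graph()            # start again from an empty graph
matrix = build_model()               # works again
print(second_run_failed, matrix.shape)
```

Putting the reset call at the top of the notebook cell that builds the graph makes the cell safe to re-run, at the cost of rebuilding every tensor each time.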