
Python: Building a bidirectional RNN in TensorFlow with GLSTM (Group LSTM) cells


I am using a CNN + LSTM + CTC network (based on ) for Chinese scene text recognition. With a large number of classes (3500+), the network is hard to train. I have heard that a Group LSTM (O. Kuchaiev and B. Ginsburg, "Factorization tricks for LSTM networks", ICLR 2017 Workshop) can reduce the number of parameters and speed up training, so I tried to use it in my code.
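
For a rough sense of why this helps, here is my own back-of-the-envelope sketch (not taken from the paper's code): a standard LSTM layer owns one kernel of shape (I + H, 4H), while a group LSTM with k groups replaces it with k independent blocks of shape ((I + H)/k, 4H/k), cutting the layer's parameters by roughly a factor of k. The sizes below are illustrative only:

def lstm_params(input_size, num_hidden):
    # Standard LSTM: one kernel of shape (input_size + num_hidden, 4 * num_hidden).
    return (input_size + num_hidden) * 4 * num_hidden

def glstm_params(input_size, num_hidden, num_groups):
    # Group LSTM: num_groups independent kernels, each acting on one group's slice.
    i_g, h_g = input_size // num_groups, num_hidden // num_groups
    return num_groups * (i_g + h_g) * 4 * h_g

print(lstm_params(512, 512))      # 2097152
print(glstm_params(512, 512, 4))  # 524288 -- about 4x fewer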

I use a two-layer bidirectional LSTM. This is the original code, using tf.contrib.rnn.LSTMCell:

rnn_outputs, _, _ = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    [tf.contrib.rnn.LSTMCell(num_units=self.num_hidden, state_is_tuple=True) for _ in range(self.num_layers)],
    [tf.contrib.rnn.LSTMCell(num_units=self.num_hidden, state_is_tuple=True) for _ in range(self.num_layers)],
    self.rnn_inputs, dtype=tf.float32, sequence_length=self.rnn_seq_len, scope='BDDLSTM')
Training is slow: after 100 hours, the prediction accuracy on the test set was still only 39%.

Now I want to use tf.contrib.rnn.GLSTMCell instead. When I replace the LSTMCell with this GLSTMCell:

rnn_outputs, _, _ = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    [tf.contrib.rnn.GLSTMCell(num_units=self.num_hidden, num_proj=self.num_proj, number_of_groups=4) for _ in range(self.num_layers)],
    [tf.contrib.rnn.GLSTMCell(num_units=self.num_hidden, num_proj=self.num_proj, number_of_groups=4) for _ in range(self.num_layers)],
    self.rnn_inputs, dtype=tf.float32, sequence_length=self.rnn_seq_len, scope='BDDLSTM')
I get the following error:

/home/frisasz/miniconda2/envs/dl/bin/python "/media/frisasz/DATA/FSZ_Work/deep learning/IDOCR_/work/train.py"
Traceback (most recent call last):
  File "/media/frisasz/DATA/FSZ_Work/deep learning/IDOCR_/work/train.py", line 171, in <module>
    train(train_dir='/media/frisasz/Windows/40T/', val_dir='../../0000/40V/')
  File "/media/frisasz/DATA/FSZ_Work/deep learning/IDOCR_/work/train.py", line 41, in train
    FLAGS.momentum)
  File "/media/frisasz/DATA/FSZ_Work/deep learning/IDOCR_/work/model.py", line 61, in __init__
    self.logits = self.rnn_net()
  File "/media/frisasz/DATA/FSZ_Work/deep learning/IDOCR_/work/model.py", line 278, in rnn_net
    self.rnn_inputs, dtype=tf.float32, sequence_length=self.rnn_seq_len, scope='BDDLSTM')
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/rnn.py", line 220, in stack_bidirectional_dynamic_rnn
    dtype=dtype)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 375, in bidirectional_dynamic_rnn
    time_major=time_major, scope=fw_scope)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 574, in dynamic_rnn
    dtype=dtype)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 737, in _dynamic_rnn_loop
    swap_memory=swap_memory)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 720, in _time_step
    skip_conditionals=True)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 206, in _rnn_step
    new_output, new_state = call_cell()
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn.py", line 708, in <lambda>
    call_cell = lambda: cell(input_t, state)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 180, in __call__
    return super(RNNCell, self).__call__(inputs, state)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 441, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/rnn_cell.py", line 2054, in call
    R_k = _linear(x_g_id, 4 * self._group_shape[1], bias=False)
  File "/home/frisasz/miniconda2/envs/dl/lib/python2.7/site-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1005, in _linear
    "but saw %s" % (shape, shape[1]))
ValueError: linear expects shape[1] to be provided for shape (?, ?), but saw ?

Process finished with exit code 1

I am not sure whether GLSTMCell can simply replace LSTMCell in tf.contrib.rnn.stack_bidirectional_dynamic_rnn() (or in the other functions that help build RNNs). I could not find any examples that use GLSTMCell. Does anyone know the right way to build a bidirectional RNN with GLSTMCell?

I ran into exactly the same error when trying to build a bidirectional GLSTM with bidirectional_dynamic_rnn.

In my case, the problem came from the fact that GLSTM can only be used when it is defined statically: none of the shape parameters it sees (for example, the batch_size) may be undefined when the graph is built.


So try to define, in the graph, all the shapes that will at some point end up inside the GLSTM cells, and it should work fine; see the sketch below.
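
For example, here is a minimal sketch under my own assumptions (input_depth is a hypothetical constant standing in for the real CNN output depth, and it must be divisible by number_of_groups): pinning the feature dimension of self.rnn_inputs with set_shape before building the RNN gives _linear the static shape[1] it is asking for.

# Hypothetical fix sketch: make the feature dimension statically known
# before the inputs reach the GLSTM cells.
input_depth = 512  # example value; replace with the actual feature depth
self.rnn_inputs.set_shape([None, None, input_depth])

rnn_outputs, _, _ = tf.contrib.rnn.stack_bidirectional_dynamic_rnn(
    [tf.contrib.rnn.GLSTMCell(num_units=self.num_hidden, num_proj=self.num_proj, number_of_groups=4) for _ in range(self.num_layers)],
    [tf.contrib.rnn.GLSTMCell(num_units=self.num_hidden, num_proj=self.num_proj, number_of_groups=4) for _ in range(self.num_layers)],
    self.rnn_inputs, dtype=tf.float32, sequence_length=self.rnn_seq_len, scope='BDDLSTM')

If the error persists, the batch dimension may also need to be fixed (e.g. set_shape([batch_size, None, input_depth]) with a known batch_size), since, as noted above, some versions of GLSTMCell rely on static shapes when slicing their inputs into groups.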

Thanks. In the end I decided not to use GLSTM; I sped up training in another way.