TensorFlow Estimator API: remembering the LSTM state of the previous batch for the next batch with a dynamic batch size


I know similar questions have been asked several times on Stack Overflow and elsewhere, but I could not find a solution to the following problem: I am trying to build a stateful LSTM model with TensorFlow and its Estimator API. I tried the solution that was suggested, and it works as long as I use a static batch_size. Using a dynamic batch size leads to the following problem:

ValueError: initial_value must have a shape specified: Tensor("DropoutWrapperZeroState/MultiRNNCellZeroState/DropoutWrapperZeroState/LSTMCellZeroState/zeros:0", shape=(?, 200), dtype=float32)

Setting tf.Variable(.., validate_shape=False) only moves the problem further down the line:

Traceback (most recent call last):
  File "model.py", line 576, in <module>
    tf.app.run(main=run_experiment)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "model.py", line 137, in run_experiment
    hparams=params  # hparams
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 210, in run
    return _execute_schedule(experiment, schedule)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/learn_runner.py", line 47, in _execute_schedule
    return task()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 495, in train_and_evaluate
    self.train(delay_secs=0)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 275, in train
    hooks=self._train_monitors + extra_hooks)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 660, in _call_train
    hooks=hooks)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 241, in train
    loss = self._train_model(input_fn=input_fn, hooks=hooks)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 560, in _train_model
    model_fn_lib.ModeKeys.TRAIN)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 545, in _call_model_fn
    features=features, labels=labels, **kwargs)
  File "model.py", line 218, in model_fn
    output, state = get_model(features, params)
  File "model.py", line 567, in get_model
    model = lstm(inputs, params)
  File "model.py", line 377, in lstm
    output, new_states = tf.nn.dynamic_rnn(multicell, inputs=inputs, initial_state=states)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 574, in dynamic_rnn
    dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 737, in _dynamic_rnn_loop
    swap_memory=swap_memory)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2770, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2599, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2549, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 722, in _time_step
    (output, new_state) = call_cell()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 708, in <lambda>
    call_cell = lambda: cell(input_t, state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 752, in __call__
    output, new_state = self._cell(inputs, state, scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 180, in __call__
    return super(RNNCell, self).__call__(inputs, state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/base.py", line 441, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 916, in call
    cur_inp, new_state = cell(cur_inp, cur_state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 752, in __call__
    output, new_state = self._cell(inputs, state, scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 180, in __call__
    return super(RNNCell, self).__call__(inputs, state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/base.py", line 441, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 542, in call
    lstm_matrix = _linear([inputs, m_prev], 4 * self._num_units, bias=True)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn_cell_impl.py", line 1002, in _linear
    raise ValueError("linear is expecting 2D arguments: %s" % shapes)
ValueError: linear is expecting 2D arguments: [TensorShape([Dimension(None), Dimension(62)]), TensorShape(None)]
As far as I understand, using non-trainable variables for this is discouraged anyway (??), which is why I kept looking for other solutions.

Now I use placeholders and something like this in my model_fn (also suggested in the GitHub thread):

def rnn_placeholders(state):
    """Convert RNN state tensors to placeholders that default to the zero state."""
    if isinstance(state, tf.contrib.rnn.LSTMStateTuple):
        c, h = state
        c = tf.placeholder_with_default(c, c.shape, c.op.name)
        h = tf.placeholder_with_default(h, h.shape, h.op.name)
        return tf.contrib.rnn.LSTMStateTuple(c, h)
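Why this helps: tf.placeholder_with_default behaves like an optional input that falls back to a default tensor when nothing is fed. A rough pure-Python analogy (this is only an illustration, not the TensorFlow implementation):

```python
# Hypothetical pure-Python analogue of tf.placeholder_with_default:
# if the caller feeds a value under the tensor's name, use it;
# otherwise fall back to the default (here: the zero state).
def placeholder_with_default(default, feed_dict, name):
    return feed_dict.get(name, default)

zero_c = [[0.0, 0.0]]
# First batch: nothing is fed, so the zero state is used.
first = placeholder_with_default(zero_c, {}, "c")
# Later batches: the hook feeds the previous final state.
later = placeholder_with_default(zero_c, {"c": [[0.3, -0.1]]}, "c")
```

So the graph can always be built against the default zero state, while a hook may override it at session-run time.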
class LSTMStateHook(tf.train.SessionRunHook):

    def __init__(self, params):
        self.init_states = None
        self.current_state = np.zeros((params.rnn_layers, 2, params.batch_size, params.state_size))

    def before_run(self, run_context):
        run_args = tf.train.SessionRunArgs(
            [tf.get_default_graph().get_tensor_by_name('LSTM/output_states:0')],
            {self.init_states: self.current_state},
        )
        return run_args

    def after_run(self, run_context, run_values):
        # Depends on your session run arguments!
        self.current_state = run_values[0][0]

    def begin(self):
        self.init_states = tf.get_default_graph().get_tensor_by_name('LSTM/init_states:0')
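The bookkeeping the hook performs can be sketched without TensorFlow, with a fake "session run" standing in for the graph (the tensor name and shapes below are illustrative only):

```python
import numpy as np

# Hypothetical stand-in for session.run: pretend the graph maps
# (initial state, batch) -> final state.
def fake_session_run(init_state, batch):
    return np.tanh(init_state + batch.sum())

class StateCarryOver:
    """before_run feeds the stored state; after_run saves the fetched one."""
    def __init__(self, shape):
        self.current_state = np.zeros(shape)

    def before_run(self):
        # The real hook puts this into SessionRunArgs as a feed dict.
        return {"LSTM/init_states:0": self.current_state}

    def after_run(self, fetched_state):
        # The real hook reads this from run_values.
        self.current_state = fetched_state

carry = StateCarryOver((2, 3))
for batch in (np.ones(3), 2 * np.ones(3)):
    feed = carry.before_run()
    final_state = fake_session_run(feed["LSTM/init_states:0"], batch)
    carry.after_run(final_state)
# carry.current_state now holds the final state of the last batch.
```

Each training step thus starts from wherever the previous step left off, which is exactly what "stateful" means here.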
# Inside model_fn: build an initial state that the hook can feed.
if self.stateful is True:
    init_states = multicell.zero_state(self.batch_size, tf.float32)
    init_states = tf.identity(init_states, "init_states")

    l = tf.unstack(init_states, axis=0)
    rnn_tuple_state = tuple(
        [tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1]) for idx in range(self.rnn_layers)])
else:
    rnn_tuple_state = multicell.zero_state(self.batch_size, tf.float32)

# Unroll RNN
output, output_states = tf.nn.dynamic_rnn(multicell, inputs=inputs, initial_state=rnn_tuple_state)

if self.stateful is True:
    output_states = tf.identity(output_states, "output_states")

return output
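Why carrying the state matters at all: for a toy tanh RNN (a hypothetical stand-in, not the TensorFlow LSTM), processing one long sequence in a single pass gives the same final state as processing it in chunks, provided each chunk starts from the previous chunk's final state:

```python
import numpy as np

rng = np.random.default_rng(0)
W_x = rng.normal(size=(4, 5))  # input -> hidden weights
W_h = rng.normal(size=(5, 5))  # hidden -> hidden weights

def run_rnn(inputs, h0):
    """Run a simple tanh RNN over inputs of shape [time, features]."""
    h = h0
    for x in inputs:
        h = np.tanh(x @ W_x + h @ W_h)
    return h

seq = rng.normal(size=(6, 4))
h0 = np.zeros(5)

h_full = run_rnn(seq, h0)            # one pass over all 6 steps
h_mid = run_rnn(seq[:3], h0)         # "batch" 1
h_chunked = run_rnn(seq[3:], h_mid)  # "batch" 2 starts from h_mid
# h_full and h_chunked are identical; resetting batch 2 to zeros would not be.
```

This equivalence breaks as soon as each batch is reset to the zero state, which is exactly what happens when the state is not carried across Estimator training steps.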