Python: How do I use an LSTM step by step, i.e. feed it a new state without redefining the part of the graph that follows the LSTM cell?


In a reinforcement learning setting I want to use an LSTM, so I have to compute single LSTM steps (timesteps) one at a time. This usually looks roughly like this:

inputx = tf.placeholder(...)  # define size and type
lstm_cell = tf.contrib.rnn.LSTMCell(hiddenunits)
state = lstm_cell.zero_state(batch_size, tf.float32)
for i in range(timesteps):
    lstm_out, state = lstm_cell(inputx, state)
    out = sess.run(lstm_out, feed_dict={inputx: my_input})
    my_input = my_environment.step(out)  # returns the observation
Normally this style of state update works perfectly well, because everything that depends on the state (i.e. lstm_out) is redefined along with it. But consider a case where something more complex should happen to the LSTM's output:

output1 = tf.someoperation1(lstm_out)
output2 = tf.someoperation2(lstm_out)
output3 = tf.someoperation3(lstm_out)
If I want to get output1, output2, and output3 in every iteration, then as far as I know I have to tell TensorFlow again in every iteration how to compute them:

inputx = tf.placeholder(...)  # define size and type
lstm_cell = tf.contrib.rnn.LSTMCell(hiddenunits)
state = lstm_cell.zero_state(batch_size, tf.float32)
for i in range(timesteps):
    lstm_out, state = lstm_cell(inputx, state)
    output1 = tf.someoperation1(lstm_out)
    output2 = tf.someoperation2(lstm_out)
    output3 = tf.someoperation3(lstm_out)
    out1, out2, out3 = session.run([output1, output2, output3], feed_dict={inputx: my_input})
    my_input = my_environment.step(out1, out2, out3)  # returns the observation
Not only is this inconvenient, I also assume it litters the TensorFlow graph with many redundant nodes for what are logically the same operations. Is there a good solution to this?
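As a toy illustration of that concern (pure Python, no TensorFlow; the graph and op-builder here are hypothetical stand-ins): if a node-building call runs inside the loop, the node count grows on every iteration even though the logical computation never changes.

```python
# Hypothetical stand-in for a TF graph: op-building calls append nodes.
graph_nodes = []

def build_op(name):
    # Stand-in for calls like tf.someoperation1(...), which add graph nodes.
    graph_nodes.append(name)
    return name

for i in range(3):  # three "timesteps"
    build_op("someoperation1")
    build_op("someoperation2")
    build_op("someoperation3")

# After 3 iterations the graph holds 9 nodes for what is logically 3 ops.
```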

I have seen that wrapping output1, output2, and output3 in a function at least improves convenience and readability:

def some_function(lstm_output):
    output1 = tf.someoperation1(lstm_output)
    output2 = tf.someoperation2(lstm_output)
    output3 = tf.someoperation3(lstm_output)
    return output1, output2, output3

inputx = tf.placeholder(...)  # define size and type
lstm_cell = tf.contrib.rnn.LSTMCell(hiddenunits)
state = lstm_cell.zero_state(batch_size, tf.float32)
for i in range(timesteps):
    lstm_out, state = lstm_cell(inputx, state)
    output1, output2, output3 = some_function(lstm_out)
    out1, out2, out3 = session.run([output1, output2, output3], feed_dict={inputx: my_input})
    my_input = my_environment.step(out1, out2, out3)  # returns the observation

But this is still somewhat inconvenient, and if you want to use the model on multiple instances I suppose it also keeps creating more and more garbage in the graph. It seems to me there should be a more convenient way to do this, one that does not require redefining everything after updating the LSTM's state.

You can define the operations before the loop by using a placeholder for the state, and then only update the placeholder's value via the feed_dict in session.run().

Example (untested):

This way you define the operations only once, and in each iteration you just pass the updated values into the placeholder.
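The define-once / feed-values pattern can be sketched framework-free. Below is a minimal NumPy stand-in (hypothetical toy sizes; a plain tanh recurrence takes the place of the LSTM cell, and a tanh of the state takes the place of my_environment.step): the step function is defined exactly once, and only the state *values* travel around the loop.

```python
import numpy as np

rng = np.random.default_rng(0)
W_in = rng.standard_normal((3, 4))  # input -> hidden (stand-in for the cell's weights)
W_h = rng.standard_normal((4, 4))   # hidden -> hidden

def step(x, h):
    """One recurrent step, defined once: new hidden state from input and old state."""
    return np.tanh(x @ W_in + h @ W_h)

h = np.zeros((1, 4))                 # analogous to lstm_cell.zero_state(...)
x = rng.standard_normal((1, 3))      # analogous to the first observation
for i in range(5):
    h = step(x, h)                   # only the values are updated each iteration
    x = np.tanh(h[:, :3])            # stand-in for my_environment.step(...)
```

The point of the sketch: nothing is redefined inside the loop; the loop body only moves values in and out of a fixed computation, which is exactly what feeding the state placeholder achieves in TensorFlow.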

EDIT: As was correctly pointed out, state is not a single tensor but an LSTMStateTuple (see the linked answer for a detailed explanation of LSTM state). So we have to modify the code slightly:

from tensorflow.contrib.rnn import LSTMStateTuple

inputx = tf.placeholder(...)  # define size and type
lstm_cell = tf.contrib.rnn.LSTMCell(hiddenunits)
zero_state = lstm_cell.zero_state(batch_size=10, dtype=tf.float32)
c_state_ph = tf.placeholder(tf.float32, shape=zero_state.c.shape)
h_state_ph = tf.placeholder(tf.float32, shape=zero_state.h.shape)
cell_state_ph = LSTMStateTuple(c_state_ph, h_state_ph)

# Build the graph once, outside the loop.
lstm_out, state = lstm_cell(inputx, cell_state_ph)
out1, out2, out3 = some_function(lstm_out)

# Initialize the state values.
state_val = session.run(zero_state)
c_state_val = state_val.c
h_state_val = state_val.h

for i in range(timesteps):
    out1_val, out2_val, out3_val, state_val = session.run(
        [out1, out2, out3, state],
        feed_dict={inputx: my_input, c_state_ph: c_state_val, h_state_ph: h_state_val})
    c_state_val = state_val.c
    h_state_val = state_val.h
    # (update my_input from the environment here, as in the question)
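This is why two placeholders are needed: an LSTM state consists of two tensors, the cell state c and the hidden state h, which is what LSTMStateTuple(c, h) packages. A minimal NumPy sketch of one standard LSTM step (hypothetical sizes, the usual gate equations) shows both parts being produced and both having to be fed back into the next step:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, c, h, W, b):
    """One standard LSTM step: returns the new (c, h), matching LSTMStateTuple(c, h)."""
    z = np.concatenate([x, h], axis=1) @ W + b
    i, f, o, g = np.split(z, 4, axis=1)       # input, forget, output gates + candidate
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return c_new, h_new

n_in, n_hid = 3, 4
rng = np.random.default_rng(1)
W = rng.standard_normal((n_in + n_hid, 4 * n_hid))
b = np.zeros(4 * n_hid)

c = np.zeros((1, n_hid))                      # analogous to zero_state.c
h = np.zeros((1, n_hid))                      # analogous to zero_state.h
x = rng.standard_normal((1, n_in))
c, h = lstm_step(x, c, h, W, b)               # both values must be fed back next step
```

Only h feeds the outputs downstream, but c carries the long-term memory, so dropping either one between steps would break the recurrence; hence one placeholder per tensor.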

Comments:

"Thanks a lot for this idea! I actually thought of it myself but honestly never tried it until now, because the state is usually not just a tensor but a specific object (LSTMStateTuple()). Can such an object be fed through a placeholder? I suppose I could set state_is_tuple=False, then it should work, but that is deprecated."

"I think you're right. If the state object returned by lstm_cell(inputx, state_ph) is a tuple rather than a tensor, we have to do some extra work: we can extract the individual tensors from the tuple and create a placeholder for each of them. See my edit."

"Glad to hear it!"