TensorFlow tf.keras.layers.LSTM: input shape for initial_state

Here I want to build a very basic, simple character-level RNN.

Suppose my dataset is embedded like this:

import numpy as np
batch_1 = np.array([[1, 2, ...., 20], [21, .....,40], [41,....,60], [61,...., 80]])
batch_2 = np.array([[...], [...], [...], [...]])
import tensorflow as tf
batch_size = 4
steps_number = 20
hidden_units = 100
keep_prob = 0.5
dim = tf.zeros([batch_size, hidden_units])
input_data = tf.keras.layers.Input(shape=(1, steps_number), batch_size=batch_size)
hidden_1, state_h, state_c = tf.keras.layers.LSTM(units=hidden_units, stateful=True, dropout=keep_prob, return_state=True)(input_data, initial_state=[dim, dim], training=True)
hidden_2 = tf.keras.layers.LSTM(units=hidden_units, stateful=True, dropout=keep_prob, return_state=False)(hidden_1, initial_state=[state_h, state_c], training=True)
hidden3 = tf.keras.layers.Dense(10, activation='relu')(hidden_1)
output = tf.keras.layers.Dense(1, activation='sigmoid')(hidden3)
model = tf.keras.models.Model(input_data, output)
Here I get this error at the hidden_2 layer: ValueError: Shape (100, 4) must have rank at least 3


The problem is that the output of the hidden layer should have shape [batch_size, steps_number, hidden_units].

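This is a working solution; however, I don't understand why I have to specify the input shape as a column array:

shape=(steps_number, 1) instead of (1, steps_number)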


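Because tf.keras.layers.LSTM expects a 3D tensor of shape [batch_size, timesteps, num_features]. Although the OP did not explicitly mention the number of features, defining shape=(steps_number, 1) treats it as one. The corrected model also sets return_sequences=True on the first LSTM, so its output keeps the time axis and remains a valid 3D input for the second LSTM.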
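To make the difference between the two input shapes concrete, here is a minimal NumPy sketch. It assumes batch_1 from the question holds 4 sequences of 20 integer-encoded characters (the values 1 to 80 are a stand-in for the elided data), and it only reshapes the array; no model is built.

```python
import numpy as np

batch_size = 4
steps_number = 20

# Stand-in for batch_1 from the question: rows [1..20], [21..40], [41..60], [61..80].
batch_1 = np.arange(1, batch_size * steps_number + 1).reshape(batch_size, steps_number)

# Layout [batch_size, timesteps, num_features] = (4, 20, 1):
# each sequence has 20 time steps, with one feature (one character code) per step.
as_steps = batch_1.reshape(batch_size, steps_number, 1)

# Layout (4, 1, 20): each sequence has a single time step carrying 20 features,
# so a recurrent layer would have nothing to iterate over in time.
as_features = batch_1.reshape(batch_size, 1, steps_number)

print(as_steps.shape)
print(as_features.shape)
```

Under this reading, shape=(steps_number, 1) lets the LSTM step through the 20 characters one at a time, whereas shape=(1, steps_number) would present all 20 values at once as a single step.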
import tensorflow as tf
batch_size = 4
steps_number = 20
hidden_units = 100
keep_prob = 0.5
dim = tf.zeros([batch_size, hidden_units])
input_data = tf.keras.layers.Input(shape=(steps_number,1), batch_size=batch_size)
hidden_1, state_h, state_c = tf.keras.layers.LSTM(units=hidden_units, stateful=True, dropout=keep_prob, return_state=True, return_sequences=True)(input_data, initial_state=[dim, dim], training=True)
print(hidden_1.shape)  # [batch_size, steps_number, hidden_units]
hidden_2 = tf.keras.layers.LSTM(units=hidden_units, stateful=True, dropout=keep_prob, return_state=False)(hidden_1, initial_state=[state_h, state_c], training=True)
# Feed the second LSTM's output (not hidden_1) into the dense layers so the stacked LSTM is actually used.
hidden3 = tf.keras.layers.Dense(10, activation='relu')(hidden_2)
output = tf.keras.layers.Dense(1, activation='sigmoid')(hidden3)
model = tf.keras.models.Model(input_data, output)
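
One way to see why return_sequences=True matters is to trace the shapes through the stack. The sketch below is plain Python shape bookkeeping, not TensorFlow code: lstm_output_shape is a hypothetical helper introduced only for illustration, and it assumes the default Keras behaviour that an LSTM returns only its last output unless return_sequences=True.

```python
def lstm_output_shape(input_shape, units, return_sequences=False):
    """Hypothetical helper (not a Keras API): output shape of an LSTM layer."""
    if len(input_shape) != 3:
        # An LSTM needs [batch_size, timesteps, num_features].
        raise ValueError(f"Shape {input_shape} must have rank 3")
    batch, steps, _features = input_shape
    if return_sequences:
        return (batch, steps, units)  # one output per time step
    return (batch, units)             # only the last time step

input_shape = (4, 20, 1)  # batch_size=4, steps_number=20, one feature

# With return_sequences=True the first layer stays 3D, so a second LSTM can consume it.
first_seq = lstm_output_shape(input_shape, 100, return_sequences=True)
second = lstm_output_shape(first_seq, 100)

# Without it the first layer collapses to 2D and stacking a second LSTM fails.
first_last = lstm_output_shape(input_shape, 100)
try:
    lstm_output_shape(first_last, 100)
    stacking_failed = False
except ValueError:
    stacking_failed = True

print(first_seq, second, first_last, stacking_failed)
```

The 2D intermediate shape in the failing branch corresponds to the rank error reported in the question, where the second LSTM received a tensor without a time axis.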