Python: expected activation_1 to have 3 dimensions, but got array with shape (126984, 67)


I am writing a model that tries to generate realistic text from examples using an LSTM.

Here is the gist of the code:

# ...
path = 'lyrics.txt'
with io.open(path, encoding='utf-8') as f:
    text = f.read().lower()
print('corpus length:', len(text))

chars = sorted(list(set(text)))
print('total chars:', len(chars))
char_indices = dict((c, i) for i, c in enumerate(chars))
indices_char = dict((i, c) for i, c in enumerate(chars))

# cut the text in semi-redundant sequences of maxlen characters
maxlen = 140

step = 3
sentences = []
next_chars = []
for i in range(0, len(text) - maxlen, step):
    sentences.append(text[i: i + maxlen])
    next_chars.append(text[i + maxlen])
print('nb sequences:', len(sentences))

print('Vectorization...')
x = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)
y = np.zeros((len(sentences), len(chars)), dtype=np.bool)
for i, sentence in enumerate(sentences):
    for t, char in enumerate(sentence):
        x[i, t, char_indices[char]] = 1
    y[i, char_indices[next_chars[i]]] = 1


# build the model: a single LSTM
print('Build model...')
model = Sequential()
model.add(LSTM(128, dropout_W=0.5, return_sequences=True, input_shape=(maxlen, len(chars))))
model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
model.add(Dense(len(chars)))
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam')
I edited it to try out the effect of stacking multiple LSTMs, which produced this error:

Using TensorFlow backend.
corpus length: 381090
total chars: 67
nb sequences: 126984
Vectorization...
Build model...
char_lstm.py:55: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, return_sequences=True, dropout=0.5, input_shape=(140, 67))`
  model.add(LSTM(128, dropout_W=0.5, return_sequences=True, input_shape=(maxlen, len(chars))))
char_lstm.py:56: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, return_sequences=True, dropout=0.5)`
  model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
char_lstm.py:57: UserWarning: Update your `LSTM` call to the Keras 2 API: `LSTM(128, return_sequences=True, dropout=0.5)`
  model.add(LSTM(128, dropout_W=0.5, return_sequences=True))
Traceback (most recent call last):
  File "char_lstm.py", line 110, in <module>
    callbacks=[print_callback])
  File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 1002, in fit
    validation_steps=validation_steps)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1630, in fit
    batch_size=batch_size)
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1480, in _standardize_user_data
    exception_prefix='target')
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 113, in _standardize_input_data
    'with shape ' + str(data_shape))
ValueError: Error when checking target: expected activation_1 to have 3 dimensions, but got array with shape (126984, 67)

I believe the last layer, model.add(Dense(len(chars))), is probably the source of the bug, and I know what the code is supposed to do. But after many shots in the dark, I need a proper fix, and more importantly, to understand how the fix relates to the bug.

You're close; the problem is indeed with Dense(len(chars)). Because you also use return_sequences=True in the last LSTM, you are actually returning a 3D tensor of shape (batch_size, maxlen, 128). Dense and softmax can handle higher-dimensional tensors, operating on the last axis (axis=-1), but that also makes them return sequences. So you have a many-to-many model, while your data is many-to-one. You have two options:

You can remove return_sequences from the last LSTM to compress the context (the past tokens) into a single vector representation of size 128, and predict based on that.
Alternatively, if you insist on using the information from all past timesteps, you need to Flatten the output before passing it to the Dense prediction layer.
By the way, you can achieve the same effect in a single line with Dense(len(chars), activation='softmax').
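The shape mechanics behind the error can be sketched in plain NumPy, independent of Keras: a Dense layer is essentially a matrix product over the last axis, so applied to a 3D sequence output it yields a 3D result, while keeping only the final timestep (which is what return_sequences=False does) yields the 2D shape the targets have. A minimal illustration, with the question's dimensions (maxlen=140, 128 units, 67 characters) and a made-up batch size and zero weights:

```python
import numpy as np

batch_size, maxlen, units, n_chars = 4, 140, 128, 67

# Output of the last LSTM with return_sequences=True: one vector per timestep.
lstm_seq_out = np.zeros((batch_size, maxlen, units))

# A Dense layer acts only on the last axis (weight matrix of shape (units, n_chars)).
W = np.zeros((units, n_chars))
many_to_many = lstm_seq_out @ W          # (batch_size, maxlen, n_chars): still 3D,
print(many_to_many.shape)                # but y has shape (batch_size, n_chars)

# With return_sequences=False the LSTM returns only the final timestep's vector.
lstm_last_out = lstm_seq_out[:, -1, :]   # (batch_size, units)
many_to_one = lstm_last_out @ W          # (batch_size, n_chars): matches y
print(many_to_one.shape)
```

The 3D result is why Keras expects a 3D target: the model's output shape drives the target check, so the fix is to make the output 2D, not to reshape y.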

Thank you for the detailed answer.