CNN-RNN integration for images in Python 3.x


I am trying to combine a CNN and an LSTM for MNIST images with the following code:

from __future__ import division, print_function, absolute_import
import tensorflow as tf
import tflearn
import numpy as np
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression

import tflearn.datasets.mnist as mnist
height = 128
width = 128
X, Y, testX, testY = mnist.load_data(one_hot=True)
X = X.reshape([-1, 28, 28, 1])
testX = testX.reshape([-1, 28, 28, 1])

# Building convolutional network
network = tflearn.input_data(shape=[None, 28, 28,1], name='input')
network = tflearn.conv_2d(network, 32, 3, activation='relu',regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = tflearn.conv_2d(network, 64, 3, activation='relu',regularizer="L2")
network = tflearn.max_pool_2d(network, 2)
network = tflearn.local_response_normalization(network)
network = fully_connected(network, 128, activation='tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
network = tflearn.reshape(network, [-1, 1, 28*28])
#lstm
network = tflearn.lstm(network, 128, return_seq=True)
network = tflearn.lstm(network, 128)
network = tflearn.fully_connected(network, 10, activation='softmax')
network = tflearn.regression(network, optimizer='adam',
                     loss='categorical_crossentropy', name='target')

#train
model = tflearn.DNN(network, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=1, validation_set=0.1, show_metric=True,snapshot_step=100)
A CNN takes a 4D tensor while an LSTM takes a 3D one, so I reshaped the network with: network = tflearn.reshape(network, [-1, 1, 28*28])

But running it produces the following error:

InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 16384 values, but the requested shape requires a multiple of 784
[[Node: Reshape/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"](Dropout_1/cond/Merge, Reshape/Reshape/shape)]]


It is not clear to me why it ends up with a tensor of 16384 values; even when I hard-code 128*128 it still does not work. I am stuck and cannot proceed.

The error is in this line:

network = tflearn.reshape(network, [-1, 1, 28*28])
The previous fully connected layer has n_units=256, so each example reaches the reshape as a vector of 256 values, which cannot be reshaped into chunks of 28*28 = 784. Change this line to:

network = tflearn.reshape(network, [-1, 1, 256])
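
To make the numbers in the error message concrete: the tensor reaching the reshape has shape [batch, 256], and with a batch size of 64 (an assumption here, tflearn's default) that is 64 * 256 = 16384 values, which is not a multiple of 784. A quick NumPy illustration:

import numpy as np

batch = 64                                  # assumed batch size (tflearn default)
features = np.zeros((batch, 256))           # output of the last fully_connected layer
print(features.size)                        # 16384 values, as reported in the error
print(features.size % (28 * 28))            # 704 -> not a multiple of 784
# features.reshape(-1, 1, 28 * 28)          # would fail: cannot reshape 16384 values
print(features.reshape(-1, 1, 256).shape)   # (64, 1, 256) -> the fixed reshape works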

Note that you are feeding the LSTM the features produced by the CNN, not the MNIST input images themselves.
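
Put together, the tail of the network after the fix would look roughly like this (a minimal sketch; the convolutional layers above it are unchanged):

network = fully_connected(network, 256, activation='tanh')
network = dropout(network, 0.8)
# 256 CNN features per example become a length-1 sequence of 256 features,
# i.e. the 3D input [batch, time_steps, features] the LSTM layers expect
network = tflearn.reshape(network, [-1, 1, 256])
network = tflearn.lstm(network, 128, return_seq=True)
network = tflearn.lstm(network, 128)
network = tflearn.fully_connected(network, 10, activation='softmax')
network = tflearn.regression(network, optimizer='adam',
                             loss='categorical_crossentropy', name='target')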

Why are you reshaping the last dropout layer to [-1, 1, 28*28]?

Because the input to an LSTM needs to be a 3D tensor.

Then you are misreading the error message: that output shape is not compatible with the input shape. Note that 28*28 = 784.

So what should the correct shape be? I am new to this.
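
If the intention is to feed the MNIST images themselves into the LSTM rather than CNN features, a common alternative is to treat each 28x28 image as a sequence of 28 rows with 28 pixels per step. A minimal sketch of that idea (layer sizes here are illustrative, not taken from the question):

import tflearn
import tflearn.datasets.mnist as mnist

X, Y, testX, testY = mnist.load_data(one_hot=True)
# each image becomes 28 time steps of 28 features (one row per step)
X = X.reshape([-1, 28, 28])
testX = testX.reshape([-1, 28, 28])

net = tflearn.input_data(shape=[None, 28, 28])
net = tflearn.lstm(net, 128, return_seq=True)
net = tflearn.lstm(net, 128)
net = tflearn.fully_connected(net, 10, activation='softmax')
net = tflearn.regression(net, optimizer='adam',
                         loss='categorical_crossentropy', name='target')

model = tflearn.DNN(net, tensorboard_verbose=0)
model.fit(X, Y, n_epoch=1, validation_set=0.1, show_metric=True)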