Python: Why does the neural network output an unpredictable shape?


I adapted the AlexNet architecture to 1D data as follows:

import keras
from keras import layers
from keras.models import Sequential
from keras.layers import Dense, Activation, Dropout, Flatten, Conv1D, MaxPooling1D
from keras.layers.normalization import BatchNormalization
import numpy as np
np.random.seed(1000)
# Instantiate an empty model
model = Sequential()

# Embedding layer (vocab_size, embedding_dim and input_length are defined earlier, not shown here)
model.add(layers.Embedding(vocab_size, embedding_dim, input_length=input_length))

# 1st Convolutional Layer
model.add(Conv1D(filters=96, kernel_size=11, strides=4, padding='same'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))
# model.add(Flatten())

# 2nd Convolutional Layer
model.add(Conv1D(filters=256, kernel_size=11, strides=1, padding='same'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))
# model.add(Flatten())

# 3rd Convolutional Layer
model.add(Conv1D(filters=384, kernel_size=3, strides=1, padding='same'))
model.add(Activation('relu'))

# 4th Convolutional Layer
model.add(Conv1D(filters=384, kernel_size=3, strides=1, padding='same'))
model.add(Activation('relu'))

# 5th Convolutional Layer
model.add(Conv1D(filters=256, kernel_size=3, strides=1, padding='same'))
model.add(Activation('relu'))
# Max Pooling
model.add(MaxPooling1D(pool_size=2, strides=2, padding='same'))

# Passing it to a Fully Connected layer
model.add(Flatten())
# 1st Fully Connected Layer
model.add(Dense(4096, input_shape=(10,39424)))
model.add(Activation('relu'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))

# 2nd Fully Connected Layer
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.4))

# 3rd Fully Connected Layer
model.add(Dense(1000))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.4))

# Output Layer
model.add(Dense(8))
model.add(Activation('softmax'))

model.summary()

# Compile the model
model.compile(loss=keras.losses.categorical_crossentropy, optimizer='adam', metrics=['accuracy'])  
Model: "sequential_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_13 (Embedding)     (None, 4924, 128)         6400      
_________________________________________________________________
conv1d_40 (Conv1D)           (None, 1231, 96)          135264    
_________________________________________________________________
activation_47 (Activation)   (None, 1231, 96)          0         
_________________________________________________________________
max_pooling1d_25 (MaxPooling (None, 616, 96)           0         
_________________________________________________________________
conv1d_41 (Conv1D)           (None, 616, 256)          270592    
_________________________________________________________________
activation_48 (Activation)   (None, 616, 256)          0         
_________________________________________________________________
max_pooling1d_26 (MaxPooling (None, 308, 256)          0         
_________________________________________________________________
conv1d_42 (Conv1D)           (None, 308, 384)          295296    
_________________________________________________________________
activation_49 (Activation)   (None, 308, 384)          0         
_________________________________________________________________
conv1d_43 (Conv1D)           (None, 308, 384)          442752    
_________________________________________________________________
activation_50 (Activation)   (None, 308, 384)          0         
_________________________________________________________________
conv1d_44 (Conv1D)           (None, 308, 256)          295168    
_________________________________________________________________
activation_51 (Activation)   (None, 308, 256)          0         
_________________________________________________________________
max_pooling1d_27 (MaxPooling (None, 154, 256)          0         
_________________________________________________________________
flatten_13 (Flatten)         (None, 39424)             0         
_________________________________________________________________
dense_30 (Dense)             (None, 4096)              161484800 
_________________________________________________________________
activation_52 (Activation)   (None, 4096)              0         
_________________________________________________________________
dropout_18 (Dropout)         (None, 4096)              0         
_________________________________________________________________
dense_31 (Dense)             (None, 4096)              16781312  
_________________________________________________________________
activation_53 (Activation)   (None, 4096)              0         
_________________________________________________________________
dropout_19 (Dropout)         (None, 4096)              0         
_________________________________________________________________
dense_32 (Dense)             (None, 1000)              4097000   
_________________________________________________________________
activation_54 (Activation)   (None, 1000)              0         
_________________________________________________________________
dropout_20 (Dropout)         (None, 1000)              0         
_________________________________________________________________
dense_33 (Dense)             (None, 8)                 8008      
_________________________________________________________________
activation_55 (Activation)   (None, 8)                 0         
=================================================================
Total params: 183,816,592
Trainable params: 183,816,592
Non-trainable params: 0
However, I got the following error:

ValueError: Input 0 of layer dense_30 is incompatible with the layer: expected axis -1 of input shape 
to have value 39424 but received input with shape [10, 23552]
I don't understand why the output of the Flatten layer is [10, 23552]. It should be 39424!! The problem is that the first Dense layer receives an unpredictable data shape from the Flatten layer!! I think the Flatten layer must output 1D data, not 2D data ([10, 23552]) as it does here. The model summary is as follows, with a quick check of the expected Flatten size right after it:

Model: "sequential_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_13 (Embedding)     (None, 4924, 128)         6400      
_________________________________________________________________
conv1d_40 (Conv1D)           (None, 1231, 96)          135264    
_________________________________________________________________
activation_47 (Activation)   (None, 1231, 96)          0         
_________________________________________________________________
max_pooling1d_25 (MaxPooling (None, 616, 96)           0         
_________________________________________________________________
conv1d_41 (Conv1D)           (None, 616, 256)          270592    
_________________________________________________________________
activation_48 (Activation)   (None, 616, 256)          0         
_________________________________________________________________
max_pooling1d_26 (MaxPooling (None, 308, 256)          0         
_________________________________________________________________
conv1d_42 (Conv1D)           (None, 308, 384)          295296    
_________________________________________________________________
activation_49 (Activation)   (None, 308, 384)          0         
_________________________________________________________________
conv1d_43 (Conv1D)           (None, 308, 384)          442752    
_________________________________________________________________
activation_50 (Activation)   (None, 308, 384)          0         
_________________________________________________________________
conv1d_44 (Conv1D)           (None, 308, 256)          295168    
_________________________________________________________________
activation_51 (Activation)   (None, 308, 256)          0         
_________________________________________________________________
max_pooling1d_27 (MaxPooling (None, 154, 256)          0         
_________________________________________________________________
flatten_13 (Flatten)         (None, 39424)             0         
_________________________________________________________________
dense_30 (Dense)             (None, 4096)              161484800 
_________________________________________________________________
activation_52 (Activation)   (None, 4096)              0         
_________________________________________________________________
dropout_18 (Dropout)         (None, 4096)              0         
_________________________________________________________________
dense_31 (Dense)             (None, 4096)              16781312  
_________________________________________________________________
activation_53 (Activation)   (None, 4096)              0         
_________________________________________________________________
dropout_19 (Dropout)         (None, 4096)              0         
_________________________________________________________________
dense_32 (Dense)             (None, 1000)              4097000   
_________________________________________________________________
activation_54 (Activation)   (None, 1000)              0         
_________________________________________________________________
dropout_20 (Dropout)         (None, 1000)              0         
_________________________________________________________________
dense_33 (Dense)             (None, 8)                 8008      
_________________________________________________________________
activation_55 (Activation)   (None, 8)                 0         
=================================================================
Total params: 183,816,592
Trainable params: 183,816,592
Non-trainable params: 0
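
A quick sanity check of the expected Flatten size, as a minimal sketch based on the strides in the summary (with padding='same', Keras outputs ceil(length / stride) timesteps per strided layer):

import math

def same_padding_length(length, stride):
    # With padding='same', a Conv1D/MaxPooling1D layer outputs ceil(length / stride) timesteps.
    return math.ceil(length / stride)

seq_len = 4924                   # input_length used by the Embedding layer (see the summary)
for stride in (4, 2, 2, 2):      # Conv1D with stride 4, then the three MaxPooling1D layers (stride 2)
    seq_len = same_padding_length(seq_len, stride)

print(seq_len)                   # 154
print(seq_len * 256)             # 154 * 256 = 39424, the Flatten size reported in the summary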

Any clue?

I guess the error shows up during training, right? What shape is your input? The network itself looks fine; you don't need the input_shape=(10, 39424) in the first Dense layer after the Flatten layer, but that shouldn't be a problem.

Yes, it shows up after the first epoch. The input shape of the first Dense layer should be 39424 (according to the model summary), since it comes from the Flatten layer. But the error says it receives an input of [10, 23552], which is very confusing.

Why are you giving an input shape to an intermediate layer? In model.add(Dense(4096, input_shape=(10, 39424))), input_shape should only be given to the first layer of the network, so of course that produces an error message.

Yes, that's a mistake, I shouldn't have defined input_shape for that layer. However, the error actually came from a typo that changed the input shape of the training and test sets. Once I corrected the typo, the code ran fine.
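
For reference, a minimal sketch of the two fixes discussed in the comments. The pad_sequences call and the X_train / X_test names are assumptions about the unshown preprocessing step, not the asker's exact code; the key observation is that 23552 = 92 * 256, i.e. at training time the Flatten layer saw 92 timesteps instead of 154, consistent with the training/test sequences being shorter than input_length.

from keras.preprocessing.sequence import pad_sequences

# X_train, X_test and input_length are assumed to come from the (unshown) preprocessing step.
# Forcing every sequence to the length the model was built for removes the mismatch:
# with shorter sequences, Flatten emits 92 * 256 = 23552 features instead of 154 * 256 = 39424.
X_train = pad_sequences(X_train, maxlen=input_length)
X_test = pad_sequences(X_test, maxlen=input_length)

# In the corrected model definition, input_shape is given only to the first layer,
# so the first fully connected layer after Flatten is simply:
model.add(Dense(4096))           # no input_shape here; Keras infers 39424 from the Flatten output
model.add(Activation('relu'))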