Computer vision: output feature map dimensions of the VGG16 model

I saw a feature extraction example and used the following code to extract features from an input image:

from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image
import numpy as np

input_shape = (224, 224, 3)
model = VGG16(weights='imagenet', input_shape=input_shape, pooling='max', include_top=False)
# img_path is the path to the input image
img = image.load_img(img_path, target_size=input_shape[:2])
img = image.img_to_array(img)
img = np.expand_dims(img, axis=0)  # add the batch dimension: (1, 224, 224, 3)
img = preprocess_input(img)
feature = model.predict(img)

Then, when I print the shape of the feature variable, I find that it is (1, 512). Why is it this dimension?

The output of model.summary() shows that the last conv block, after its max pooling layer, has an output shape of (7, 7, 512), which is the dimension I expected feature to have.

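Thanks for helping me solve this problem. Because the person who helped me had some trouble posting an answer, I am putting that answer here in case anyone else has the same question.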

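Basically, this is because a global max pooling layer is specified in this model (as we can see in the model = VGG16(..., pooling='max', ...) line). For each of the 512 channels, it takes the maximum value over the 7*7 cells. The Keras documentation describes the argument as follows:

pooling: Optional pooling mode for feature extraction when include_top is False.

In the output of model.summary(), we can see that after the max pooling layer of the fifth convolutional block there is in fact a global_max_pooling2d_1 layer, so the final dimension becomes 512.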
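
The reduction from (7, 7, 512) to 512 can be reproduced without Keras. The following is a minimal NumPy sketch, not the Keras implementation: the helper name global_max_pool_2d and the random input tensor are illustrative assumptions standing in for the real output of the fifth block.

```python
import numpy as np

def global_max_pool_2d(feature_maps):
    """Take the maximum over the two spatial axes of a (batch, height, width, channels) array."""
    return feature_maps.max(axis=(1, 2))

# Hypothetical stand-in for the (1, 7, 7, 512) output of the fifth conv block;
# the values are random, not real VGG16 activations.
rng = np.random.default_rng(0)
block5_output = rng.random((1, 7, 7, 512))

pooled = global_max_pool_2d(block5_output)
print(block5_output.shape)  # (1, 7, 7, 512)
print(pooled.shape)         # (1, 512)
```

Each of the 512 output values is the largest of the 49 entries in the corresponding 7*7 channel map. To keep the 4D output instead, the pooling argument can be left at its default of None, in which case the model returns the output of the last convolutional block unchanged.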