Python - training a dense layer on bottleneck features vs. freezing all layers except the last should be equivalent, but they behave differently

As a sanity check, I tried two ways of doing transfer learning that I expected to behave the same, at least in terms of the results they produce.

The first approach uses bottleneck features (as explained in https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html): run the existing pretrained network up to the point just before the last dense layer, save the features it produces, and then train a new dense layer using those features as input.
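
The original code for this first approach is not included in the post. Purely as an illustration, here is a minimal sketch of what such a bottleneck-feature pipeline could look like, reusing the inception_v4.create_model helper and the Keras 1 API from the full code at the end of the post (train_data_dir, nbr_train_samples and num_classes are taken from that code; train_labels is a hypothetical placeholder for the one-hot training labels):

import numpy as np
import inception_v4
from keras.models import Model, Sequential
from keras.layers import Dense
from keras.preprocessing.image import ImageDataGenerator

# Cut the pretrained network just before its final 1001-way dense layer
# and use it as a fixed feature extractor.
v4 = inception_v4.create_model(weights='imagenet')
feature_extractor = Model(input=[v4.layers[1].input], output=[v4.layers[-2].output])

datagen = ImageDataGenerator(rescale=1./255)
generator = datagen.flow_from_directory(train_data_dir,
                                        target_size=(299, 299),
                                        batch_size=1,
                                        shuffle=False,
                                        class_mode=None)

# Save the bottleneck features, then train a small dense model on top of them.
bottleneck_features = feature_extractor.predict_generator(generator, nbr_train_samples)
np.save('bottleneck_features_train.npy', bottleneck_features)

top_model = Sequential()
top_model.add(Dense(output_dim=num_classes, activation='softmax',
                    input_shape=bottleneck_features.shape[1:]))
top_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
top_model.fit(bottleneck_features, train_labels, nb_epoch=50)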

The second approach replaces the model's last dense layer with a new dense layer and then freezes every other layer in the model.
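
In outline (the full code for this approach appears at the end of the post), the second approach amounts to:

import inception_v4
from keras.models import Model
from keras.layers import Dense

# Replace the final 1001-way dense layer with a new one for num_classes classes.
v4 = inception_v4.create_model(weights='imagenet')
predictions = Dense(output_dim=num_classes, activation='softmax', name="newDense")(v4.layers[-2].output)
t_model = Model(input=[v4.layers[1].input], output=[predictions])

# Freeze everything, then unfreeze only the new dense layer.
for layer in t_model.layers:
    layer.trainable = False
t_model.layers[-1].trainable = True
t_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])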

I expected the second approach to work as well as the first, but it does not.

The output of the first approach is:

 Epoch 1/50
16/16 [==============================] - 0s - loss: 1.3095 - acc: 0.4375 - val_loss: 0.4533 - val_acc: 0.7500
Epoch 2/50
16/16 [==============================] - 0s - loss: 0.3555 - acc: 0.8125 - val_loss: 0.2305 - val_acc: 1.0000
Epoch 3/50
16/16 [==============================] - 0s - loss: 0.1365 - acc: 1.0000 - val_loss: 0.1603 - val_acc: 1.0000
Epoch 4/50
16/16 [==============================] - 0s - loss: 0.0600 - acc: 1.0000 - val_loss: 0.1012 - val_acc: 1.0000
Epoch 5/50
16/16 [==============================] - 0s - loss: 0.0296 - acc: 1.0000 - val_loss: 0.0681 - val_acc: 1.0000
Epoch 6/50
16/16 [==============================] - 0s - loss: 0.0165 - acc: 1.0000 - val_loss: 0.0521 - val_acc: 1.0000
Epoch 7/50
16/16 [==============================] - 0s - loss: 0.0082 - acc: 1.0000 - val_loss: 0.0321 - val_acc: 1.0000
Epoch 8/50
16/16 [==============================] - 0s - loss: 0.0036 - acc: 1.0000 - val_loss: 0.0222 - val_acc: 1.0000
Epoch 9/50
16/16 [==============================] - 0s - loss: 0.0023 - acc: 1.0000 - val_loss: 0.0185 - val_acc: 1.0000
Epoch 10/50
16/16 [==============================] - 0s - loss: 0.0011 - acc: 1.0000 - val_loss: 0.0108 - val_acc: 1.0000
Epoch 11/50
16/16 [==============================] - 0s - loss: 5.6636e-04 - acc: 1.0000 - val_loss: 0.0087 - val_acc: 1.0000
Epoch 12/50
16/16 [==============================] - 0s - loss: 2.9463e-04 - acc: 1.0000 - val_loss: 0.0094 - val_acc: 1.0000
Epoch 13/50
16/16 [==============================] - 0s - loss: 1.5169e-04 - acc: 1.0000 - val_loss: 0.0072 - val_acc: 1.0000
Epoch 14/50
16/16 [==============================] - 0s - loss: 7.4001e-05 - acc: 1.0000 - val_loss: 0.0039 - val_acc: 1.0000
Epoch 15/50
16/16 [==============================] - 0s - loss: 3.9956e-05 - acc: 1.0000 - val_loss: 0.0034 - val_acc: 1.0000
Epoch 16/50
16/16 [==============================] - 0s - loss: 2.0384e-05 - acc: 1.0000 - val_loss: 0.0024 - val_acc: 1.0000
Epoch 17/50
16/16 [==============================] - 0s - loss: 1.0036e-05 - acc: 1.0000 - val_loss: 0.0026 - val_acc: 1.0000
Epoch 18/50
16/16 [==============================] - 0s - loss: 5.0962e-06 - acc: 1.0000 - val_loss: 0.0010 - val_acc: 1.0000
Epoch 19/50
16/16 [==============================] - 0s - loss: 2.7791e-06 - acc: 1.0000 - val_loss: 0.0011 - val_acc: 1.0000
Epoch 20/50
16/16 [==============================] - 0s - loss: 1.5646e-06 - acc: 1.0000 - val_loss: 0.0015 - val_acc: 1.0000
Epoch 21/50
16/16 [==============================] - 0s - loss: 8.6427e-07 - acc: 1.0000 - val_loss: 9.0825e-04 - val_acc: 1.0000
Epoch 22/50
16/16 [==============================] - 0s - loss: 4.3958e-07 - acc: 1.0000 - val_loss: 5.6370e-04 - val_acc: 1.0000
Epoch 23/50
16/16 [==============================] - 0s - loss: 2.5332e-07 - acc: 1.0000 - val_loss: 5.1226e-04 - val_acc: 1.0000
Epoch 24/50
16/16 [==============================] - 0s - loss: 1.6391e-07 - acc: 1.0000 - val_loss: 6.6560e-04 - val_acc: 1.0000
Epoch 25/50
16/16 [==============================] - 0s - loss: 1.3411e-07 - acc: 1.0000 - val_loss: 6.5456e-04 - val_acc: 1.0000
Epoch 26/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 27/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 28/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 29/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 30/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
It converges quickly and gives good results.

The second approach, on the other hand, gives:

Epoch 1/50
24/24 [==============================] - 63s - loss: 0.7375 - acc: 0.7500 - val_loss: 0.7575 - val_acc: 0.6667
Epoch 2/50
24/24 [==============================] - 61s - loss: 0.6763 - acc: 0.7500 - val_loss: 1.5228 - val_acc: 0.5000
Epoch 3/50
24/24 [==============================] - 61s - loss: 0.7149 - acc: 0.7500 - val_loss: 3.5805 - val_acc: 0.3333
Epoch 4/50
24/24 [==============================] - 61s - loss: 0.6363 - acc: 0.7500 - val_loss: 1.5066 - val_acc: 0.5000
Epoch 5/50
24/24 [==============================] - 61s - loss: 0.6542 - acc: 0.7500 - val_loss: 1.8745 - val_acc: 0.6667
Epoch 6/50
24/24 [==============================] - 61s - loss: 0.7007 - acc: 0.7500 - val_loss: 1.5328 - val_acc: 0.5000
Epoch 7/50
24/24 [==============================] - 61s - loss: 0.6900 - acc: 0.7500 - val_loss: 3.6004 - val_acc: 0.3333
Epoch 8/50
24/24 [==============================] - 61s - loss: 0.6615 - acc: 0.7500 - val_loss: 1.5734 - val_acc: 0.5000
Epoch 9/50
24/24 [==============================] - 61s - loss: 0.6571 - acc: 0.7500 - val_loss: 3.0078 - val_acc: 0.6667
Epoch 10/50
24/24 [==============================] - 61s - loss: 0.5762 - acc: 0.7083 - val_loss: 3.6029 - val_acc: 0.5000
Epoch 11/50
24/24 [==============================] - 61s - loss: 0.6515 - acc: 0.7500 - val_loss: 5.8610 - val_acc: 0.3333
Epoch 12/50
24/24 [==============================] - 61s - loss: 0.6541 - acc: 0.7083 - val_loss: 2.4551 - val_acc: 0.5000
Epoch 13/50
24/24 [==============================] - 61s - loss: 0.6700 - acc: 0.7500 - val_loss: 2.9983 - val_acc: 0.6667
Epoch 14/50
24/24 [==============================] - 61s - loss: 0.6486 - acc: 0.7500 - val_loss: 3.6179 - val_acc: 0.5000
Epoch 15/50
24/24 [==============================] - 61s - loss: 0.6985 - acc: 0.6667 - val_loss: 5.8419 - val_acc: 0.3333
Epoch 16/50
24/24 [==============================] - 62s - loss: 0.6465 - acc: 0.7083 - val_loss: 2.5201 - val_acc: 0.5000
Epoch 17/50
24/24 [==============================] - 62s - loss: 0.6246 - acc: 0.7500 - val_loss: 2.9912 - val_acc: 0.6667
Epoch 18/50
24/24 [==============================] - 62s - loss: 0.6768 - acc: 0.7500 - val_loss: 3.6320 - val_acc: 0.5000
Epoch 19/50
24/24 [==============================] - 62s - loss: 0.5774 - acc: 0.7083 - val_loss: 5.8575 - val_acc: 0.3333
Epoch 20/50
24/24 [==============================] - 62s - loss: 0.6642 - acc: 0.7500 - val_loss: 2.5865 - val_acc: 0.5000
Epoch 21/50
24/24 [==============================] - 63s - loss: 0.6553 - acc: 0.7083 - val_loss: 2.9967 - val_acc: 0.6667
Epoch 22/50
24/24 [==============================] - 62s - loss: 0.6469 - acc: 0.7083 - val_loss: 3.6233 - val_acc: 0.5000
Epoch 23/50
24/24 [==============================] - 64s - loss: 0.6029 - acc: 0.7500 - val_loss: 5.8225 - val_acc: 0.3333
Epoch 24/50
24/24 [==============================] - 63s - loss: 0.6183 - acc: 0.7083 - val_loss: 2.5325 - val_acc: 0.5000
Epoch 25/50
24/24 [==============================] - 62s - loss: 0.6631 - acc: 0.7500 - val_loss: 2.9879 - val_acc: 0.6667
Epoch 26/50
24/24 [==============================] - 63s - loss: 0.6082 - acc: 0.7500 - val_loss: 3.6206 - val_acc: 0.5000
Epoch 27/50
24/24 [==============================] - 62s - loss: 0.6536 - acc: 0.7500 - val_loss: 5.7937 - val_acc: 0.3333
Epoch 28/50
24/24 [==============================] - 63s - loss: 0.5853 - acc: 0.7500 - val_loss: 2.6138 - val_acc: 0.5000
Epoch 29/50
24/24 [==============================] - 62s - loss: 0.5523 - acc: 0.7500 - val_loss: 3.0126 - val_acc: 0.6667
Epoch 30/50
24/24 [==============================] - 62s - loss: 0.7112 - acc: 0.7500 - val_loss: 3.7054 - val_acc: 0.5000
Both approaches use the same model (Inception V4). My code is as follows:

First approach (bottleneck features):

Second approach (freeze everything except the last layer); the full code appears at the end of this post.

My reasoning was that freezing the whole network except the top and training only the top should be essentially the same as using everything below the top to generate the features that would have been fed into it, and then training a new dense layer on those saved features.

So either my code or my understanding of the problem (or both) must be wrong.

What am I doing wrong?


Thank you for your time.

This is a really neat problem. It is caused by the Dropout layer in the second approach. Even when a layer is set to non-trainable, Dropout is still applied during training and prevents the network from overfitting by randomly perturbing its input.
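
A minimal sketch of that behaviour (written with the tf.keras API rather than the Keras 1 syntax used in the question, purely for illustration): setting trainable = False does not switch off Dropout's training-time masking, because Dropout has no weights for that flag to affect.

import numpy as np
import tensorflow as tf

x = np.ones((1, 8), dtype="float32")
drop = tf.keras.layers.Dropout(0.5)
drop.trainable = False  # trainable only controls weight updates; Dropout has none

print(drop(x, training=False).numpy())  # inference mode: identity, all ones
print(drop(x, training=True).numpy())   # training mode: ~half the values zeroed, the rest scaled by 2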

Try changing the code to:

v4 = inception_v4.create_model(weights='imagenet')
# Attach the new classifier further back in the network, upstream of the Dropout
# layer, so the new dense layer is not fed dropout-masked activations.
predictions = Flatten()(v4.layers[-4].output)
predictions = Dense(output_dim=num_classes, activation='softmax', name="newDense")(predictions)
Also, because of the BatchNormalization layers, change batch_size to 24.

This should work.
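
To illustrate why batch_size = 1 interacts badly with BatchNormalization (again a tf.keras sketch, not code from the question): with a single sample per batch, the batch mean is the sample itself, so the normalized activations collapse towards zero during training.

import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()

x = np.random.rand(1, 4).astype("float32")
print(bn(x, training=True).numpy())       # batch of one: every feature normalizes to ~0

xb = np.random.rand(24, 4).astype("float32")
print(bn(xb, training=True).numpy()[:2])  # batch of 24: per-feature statistics are meaningful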

In the bottleneck case, could you print out the result of model.predict()?

I assume you mean that I should add it in the save_BN block. That gives TypeError: predict() takes at least 2 arguments (1 given).

I get the error: Traceback (most recent call last): File "re_ask.py", line 77, in train_top_model(num_classes); File "re_ask.py", line 35, in train_top_model, predictions = Flatten()(v4.layers[-4].output); TypeError: __init__() takes exactly 1 argument (2 given)

Thanks. The new code runs, but it does not improve training: Epoch 1/50 loss: 0.5360 - acc: 0.7917 - val_loss: 0.8624 - val_acc: 0.6667; Epoch 2/50 loss: 0.6320 - acc: 0.7500 - val_loss: 1.6275 - val_acc: 0.5000; ... Epoch 22/50 loss: 0.5968 - acc: 0.7500 - val_loss: 3.7298 - val_acc: 0.5000; Epoch 23/50 loss: 0.5971 - acc: 0.7500 - val_loss: 5.8331 - val_acc: 0.3333

Could you print v4.layers[-n] for n = 1, 2, 3, 4, 5? Code: `for index in range(1, 6): print v4.layers[-index]` Output: "" Checked with the batch size set to 24.
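
Full code of the second approach (freeze all layers except the last):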
from keras import backend as K
import inception_v4
import numpy as np
import cv2
import os

from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input

from keras.models import Model
os.environ['CUDA_VISIBLE_DEVICES'] = ''   # hide all GPUs so everything runs on the CPU


my_batch_size=1


train_data_dir ='//shared_directory/projects/try_CDxx/data/train/'
validation_data_dir ='//shared_directory/projects/try_CDxx/data/validation/'
top_model_path= 'tm_trained_model.h5'

img_width, img_height = 299, 299
num_classes=2
#nb_epoch=50
nb_epoch=50
nbr_train_samples = 24
nbr_validation_samples = 12


def train_top_model (num_classes):

    v4 = inception_v4.create_model(weights='imagenet')
    predictions = Dense(output_dim=num_classes, activation='softmax', name="newDense")(v4.layers[-2].output) # replacing the 1001 categories dense layer with my own 
    main_input= v4.layers[1].input
    main_output=predictions
    t_model = Model(input=[main_input], output=[main_output])


    val_datagen = ImageDataGenerator(rescale=1./255)
    train_datagen  = ImageDataGenerator(rescale=1./255)  


    train_generator = train_datagen.flow_from_directory(
            train_data_dir,
            target_size = (img_width, img_height),
            batch_size = my_batch_size,
            shuffle = False,
            class_mode = 'categorical')

    validation_generator = val_datagen.flow_from_directory(
            validation_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            shuffle = False,
            class_mode = 'categorical') 
#
    # Freeze every layer, then unfreeze only the newly added dense layer.
    for layer in t_model.layers:
        layer.trainable = False
    t_model.layers[-1].trainable = True
    t_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])


#
    t_model.fit_generator(
            train_generator,
            samples_per_epoch = nbr_train_samples,
            nb_epoch = nb_epoch,
            validation_data = validation_generator,
            nb_val_samples = nbr_validation_samples)
    t_model.save(top_model_path)    

#   print (t_model.trainable_weights)

train_top_model(num_classes)