Tensorflow: cannot explain model behavior in transfer learning

I have a very small image dataset (747 images for training, 250 for testing, all resized to 256 x 256). The task is multi-label classification: the two infections can in principle co-occur, although my training data contains no such cases.

Since my dataset is so small, I decided to use transfer learning with VGG16 and InceptionV3. When I train VGG16, everything behaves as the theory predicts: training loss and validation loss both keep decreasing, with no large gap between them, as shown in Figure 1.

When I train InceptionV3, the model seems to overfit, but I am not sure, because the training loss is about 0.6 while the validation loss is roughly 10 times the training loss, as shown in Figure 2.

Both models have the same three dense layers added on top. I attach the code for reference. I cannot explain why the larger model (VGG16) does not overfit this dataset while InceptionV3 does. Can anyone suggest what is wrong with my InceptionV3 setup?

# Imports assumed by both builders below (standalone Keras 2.x API);
# the functions appear to be methods of a class that is not shown.
from keras.models import Model
from keras.layers import (Input, Dense, Dropout, Flatten, Activation,
                          BatchNormalization, GlobalAveragePooling2D)
from keras.applications.vgg16 import VGG16
from keras.applications.inception_v3 import InceptionV3
from keras.optimizers import Adam

def xvgg16(self, height, width, depth, num_class, hparams):
        """
        This function defines transfer learning for vgg16

        Parameters
        ----------
        height : Integer
            Image height (pixel)
        width : Integer
            Image width (pixel)
        depth : Integer
            Image channel
        num_class : Integer
            Number of class labels
        hparams: Dictionary
            Hyperparameters

        Returns
        -------
        model : Keras model object
            The transfer model

        """
        input_tensor = Input(shape=(height, width, depth))
        pretrain = VGG16(weights="imagenet", include_top=False, input_tensor=input_tensor)

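        # Grab handles to VGG16's pretrained conv/pool layers so the
        # graph can be rebuilt below with BatchNormalization in between.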
        conv1_1 = pretrain.layers[1]
        conv1_2 = pretrain.layers[2]
        pool1 = pretrain.layers[3]
        conv2_1 = pretrain.layers[4]
        conv2_2 = pretrain.layers[5]
        pool2 = pretrain.layers[6]
        conv3_1 = pretrain.layers[7]
        conv3_2 = pretrain.layers[8]
        conv3_3 = pretrain.layers[9]
        pool3 = pretrain.layers[10]
        conv4_1 = pretrain.layers[11]
        conv4_2 = pretrain.layers[12]
        conv4_3 = pretrain.layers[13]
        pool4 = pretrain.layers[14]
        conv5_1 = pretrain.layers[15]
        conv5_2 = pretrain.layers[16]
        conv5_3 = pretrain.layers[17]
        pool5 = pretrain.layers[18]

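        # Rebuild the forward pass, reusing the pretrained layers and
        # inserting a BatchNormalization layer after every convolution.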
        x = BatchNormalization(axis=-1)(conv1_1.output)
        x = conv1_2(x)
        x = BatchNormalization(axis=-1)(x)
        x = pool1(x)
        x = conv2_1(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv2_2(x)
        x = BatchNormalization(axis=-1)(x)
        x = pool2(x)
        x = conv3_1(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv3_2(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv3_3(x)
        x = BatchNormalization(axis=-1)(x)
        x = pool3(x)
        x = conv4_1(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv4_2(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv4_3(x)
        x = BatchNormalization(axis=-1)(x)
        x = pool4(x)
        x = conv5_1(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv5_2(x)
        x = BatchNormalization(axis=-1)(x)
        x = conv5_3(x)
        x = BatchNormalization(axis=-1)(x)
        x = pool5(x)

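        # Classification head: three identical dense blocks followed by
        # a per-label sigmoid output (multi-label setup).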
        x = Flatten()(x)
        x = Dense(64, use_bias=False)(x)
        x = Dropout(0.25)(x)
        x = BatchNormalization(axis=-1)(x)
        x = Activation("relu")(x)

        x = Dense(64, use_bias=False)(x)
        x = Dropout(0.25)(x)
        x = BatchNormalization(axis=-1)(x)
        x = Activation("relu")(x)

        x = Dense(64, use_bias=False)(x)
        x = Dropout(0.25)(x)
        x = BatchNormalization(axis=-1)(x)
        x = Activation("relu")(x)

        x = Dense(num_class)(x)
        x = Activation("sigmoid")(x)

        model = Model(inputs=pretrain.layers[0].input, outputs=x)

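        # Freeze the pretrained conv layers; the BatchNormalization
        # layers and the dense head remain trainable.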
        for layer in model.layers:
            if "conv" in layer.name:
                layer.trainable = False    

        model.compile(loss="binary_crossentropy", optimizer=Adam(lr=hparams["learning_rate"]), metrics=["binary_accuracy"])

        return model

def inception3(self, height, width, depth, num_class, hparams):
        """
        This function defines transfer learning for InceptionV3

        Parameters
        ----------
        height : Integer
            Image height (pixel)
        width : Integer
            Image width (pixel)
        depth : Integer
            Image channel
        num_class : Integer
            Number of class labels
        hparams: Dictionary
            Hyperparameters

        Returns
        -------
        model : Keras model object
            The transfer model
        """
        input_tensor = Input(shape=(height, width, depth))
        pretrain = InceptionV3(weights="imagenet", include_top=False, input_tensor=input_tensor)

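        # Pool the pretrained Inception features globally and attach the
        # same three-dense-block head as the VGG16 variant.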
        x = pretrain.output
        x = GlobalAveragePooling2D()(x)

        x = Dense(64, use_bias=False)(x)
        x = Dropout(0.25)(x)
        x = BatchNormalization(axis=-1)(x)
        x = Activation("relu")(x)

        x = Dense(64, use_bias=False)(x)
        x = Dropout(0.25)(x)
        x = BatchNormalization(axis=-1)(x)
        x = Activation("relu")(x)

        x = Dense(64, use_bias=False)(x)
        x = Dropout(0.25)(x)
        x = BatchNormalization(axis=-1)(x)
        x = Activation("relu")(x)

        x = Dense(num_class)(x)
        x = Activation("sigmoid")(x)

        model = Model(inputs=pretrain.input, outputs=x)

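        # Freeze the entire pretrained Inception base; only the new head
        # (and its BatchNormalization layers) is trained.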
        for layer in pretrain.layers:
            layer.trainable = False

        model.compile(loss="binary_crossentropy", optimizer=Adam(lr=hparams["learning_rate"]), metrics=["binary_accuracy"])

        return model
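
For reference, a minimal sketch of how these builders might be invoked (hypothetical: the wrapping class ModelFactory and the arrays X_train/y_train/X_val/y_val are not shown here):

factory = ModelFactory()  # hypothetical class holding the two builders above
hparams = {"learning_rate": 1e-4}
model = factory.xvgg16(256, 256, 3, 2, hparams)
model.fit(X_train, y_train, batch_size=32, epochs=50,
          validation_data=(X_val, y_val))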

You should know that the VGG and Inception models in Keras are both pretrained on imagenet, but with different preprocessing functions.

VGG's preprocessing keeps the image pixel values on the (0, 255) scale (it only zero-centers each channel), while Inception_v3's preprocessing rescales the pixel values into the (-1, 1) range.
Therefore, when you train VGG, you should first preprocess the input images as follows:

from keras.applications.vgg16 import preprocess_input
X_train = ... # read your training images
X_train = preprocess_input(X_train)
print(X_train.max(), X_train.min(), X_train.mean())
You will see that the max, min, and mean pixel values stay on roughly the original (0, 255) scale: VGG's preprocess_input converts RGB to BGR and subtracts the per-channel ImageNet means, without any rescaling.

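As a quick check (a minimal sketch; the constants below are the standard ImageNet channel means that Keras's "caffe"-style preprocessing uses), the VGG transform is equivalent to:

import numpy as np

x = np.random.uniform(0, 255, (1, 256, 256, 3)).astype("float32")
# RGB -> BGR, then subtract the per-channel ImageNet means; there is
# no rescaling, so values stay on the original 0-255 scale.
manual = x[..., ::-1] - np.array([103.939, 116.779, 123.68], dtype="float32")
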
For Inception_v3, you should do the following instead:

from keras.applications.inception_v3 import preprocess_input
X_train = ... # read your training images
X_train = preprocess_input(X_train)
print(X_train.max(), X_train.min(), X_train.mean())
Here the values will lie between -1 and 1.

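Again as a sketch (assuming the "tf"-style preprocessing Keras uses for Inception), the transform is simply a rescale:

import numpy as np

x = np.random.uniform(0, 255, (1, 256, 256, 3)).astype("float32")
# Map [0, 255] to [-1, 1]: divide by 127.5, then subtract 1.
manual = x / 127.5 - 1.0
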
In your current code, VGG works well because your images' raw pixel values are already on the 0-255 scale that the VGG model expects, but Inception V3 fails because it expects pixel values between -1 and 1.
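
If you feed your model from a generator, one way to wire the correct preprocessing in (a sketch, assuming Keras's ImageDataGenerator; adapt it to your actual input pipeline) is:

from keras.applications.inception_v3 import preprocess_input
from keras.preprocessing.image import ImageDataGenerator

# Apply Inception's preprocessing to every batch the generator yields.
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_gen = datagen.flow(X_train, y_train, batch_size=32)
model.fit_generator(train_gen, epochs=50)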

Hope this helps.

Great :)