Python: My own Fast R-CNN implementation does not train well on balanced data


python tensorflow keras faster-rcnn


2020.06.09

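There are 700 images for training. Each image contributes 64 ROIs, which together form one mini-batch, so with a batch size of 2 one epoch takes 350 steps. With R-CNN, by contrast, every target is cropped out as its own image and resized to 224*224, giving 64*700 = 44800 images, each carrying more information and features than a 7*7 pooled feature map. I think this is why Fast R-CNN seems to fit poorly here, even though R-CNN trains well on the same data.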
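The arithmetic in the note above can be checked with a short sketch. The 512-channel depth is taken from the 28*28*512 feature map mentioned later in this post, and the 3-channel RGB crop is an assumption; the per-ROI value counts are only a rough proxy for "information", not a measurement.

```python
import math

def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Optimizer steps needed to see every image once."""
    return math.ceil(num_images / batch_size)

num_images = 700
rois_per_image = 64
batch_size = 2

# Fast R-CNN: one step consumes `batch_size` whole images (each with its ROIs).
fast_rcnn_steps = steps_per_epoch(num_images, batch_size)

# R-CNN: every ROI becomes its own resized training image.
rcnn_crops = num_images * rois_per_image

# Raw values available per ROI in each setup (assumed channel counts).
pooled_values = 7 * 7 * 512    # 7*7 pooled map, 512 channels
crop_values = 224 * 224 * 3    # 224*224 RGB crop

print(fast_rcnn_steps)               # 350
print(rcnn_crops)                    # 44800
print(crop_values / pooled_values)   # 6.0
```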

==========================================================================

With perfectly balanced data, accuracy drops to 0.53 (on the training data).

I think the network is just guessing rather than learning.
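To judge whether an accuracy figure means anything, it helps to compare it with the accuracy of a classifier that always predicts the majority class. The sketch below is only an illustration; the ROI counts are hypothetical and not taken from the dataset in this post.

```python
def majority_baseline_accuracy(num_negative: int, num_positive: int) -> float:
    """Accuracy obtained by always predicting the more frequent class."""
    total = num_negative + num_positive
    if total == 0:
        raise ValueError("at least one sample is required")
    return max(num_negative, num_positive) / total

# Hypothetical counts: a balanced set and a 9:1 negative/positive set.
balanced = majority_baseline_accuracy(1000, 1000)
skewed = majority_baseline_accuracy(9000, 1000)

print(balanced)  # 0.5
print(skewed)    # 0.9
```

On balanced data the baseline is 0.5, so 0.53 is barely above chance. On a skewed set, a high accuracy can be reached without learning anything about the positive class.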

==========================================================================

2020.06.08

I followed the structure used in many GitHub repos, but the accuracy does not improve:

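import os

import tensorflow as tf
from tensorflow.keras.layers import Dense, Dropout, Flatten, Input, TimeDistributed
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Custom layer defined in ROI_Pooling.py (import path assumed)
from ROI_Pooling import RoiPoolingConv


def build_model():
    pooled_square_size = 7
    num_rois = 32
    # One (num_rois, 4) array of ROI coordinates per image
    roi_input = Input(shape=(num_rois, 4), name="input_2")
    model_cnn = tf.keras.applications.VGG16(
        include_top=True,
        weights='imagenet'
    )
    # layers[17] is block5_conv3, the last convolutional layer of VGG16
    x = model_cnn.layers[17].output
    x = RoiPoolingConv(pooled_square_size, roi_input.shape[1])([x, roi_input])
    # Classification head, applied to every ROI independently
    x = TimeDistributed(Flatten())(x)
    x = TimeDistributed(Dense(4096, activation='selu'))(x)
    x = TimeDistributed(Dropout(0.5))(x)
    x = TimeDistributed(Dense(4096, activation='selu'))(x)
    x = TimeDistributed(Dropout(0.5))(x)
    x = TimeDistributed(Dense(2, activation='softmax', kernel_initializer='zeros'))(x)
    model_final = Model(inputs=[model_cnn.input, roi_input], outputs=x)
    opt = Adam(learning_rate=0.0001)
    model_final.compile(
        loss=tf.keras.losses.CategoricalCrossentropy(),
        optimizer=opt,
        metrics=["accuracy"]
    )
    model_final.save(os.path.join("TrainedModels", "FastRCNN.h5"))
    return model_final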

Training log:

100/100 [==============================] - ETA: 0s - loss: 0.5556 - accuracy: 0.7681
Epoch 00001: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 41s 412ms/step - loss: 0.5556 - accuracy: 0.7681
Epoch 2/100
100/100 [==============================] - ETA: 0s - loss: 0.5223 - accuracy: 0.7910
Epoch 00002: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 41s 414ms/step - loss: 0.5223 - accuracy: 0.7910
Epoch 3/100
100/100 [==============================] - ETA: 0s - loss: 0.5340 - accuracy: 0.7797
Epoch 00003: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 42s 416ms/step - loss: 0.5340 - accuracy: 0.7797
Epoch 4/100
100/100 [==============================] - ETA: 0s - loss: 0.5309 - accuracy: 0.7825
Epoch 00004: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 43s 427ms/step - loss: 0.5309 - accuracy: 0.7825
Epoch 5/100
100/100 [==============================] - ETA: 0s - loss: 0.5257 - accuracy: 0.7840
Epoch 00005: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 43s 434ms/step - loss: 0.5257 - accuracy: 0.7840
Epoch 6/100
100/100 [==============================] - ETA: 0s - loss: 0.5181 - accuracy: 0.7928
Epoch 00006: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 42s 423ms/step - loss: 0.5181 - accuracy: 0.7928
Epoch 7/100
100/100 [==============================] - ETA: 0s - loss: 0.5483 - accuracy: 0.7712
Epoch 00007: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 42s 418ms/step - loss: 0.5483 - accuracy: 0.7712
Epoch 8/100
100/100 [==============================] - ETA: 0s - loss: 0.5282 - accuracy: 0.7832
Epoch 00008: saving model to TrainedModels\FastRCNN.h5
100/100 [==============================] - 43s 429ms/step - loss: 0.5282 - accuracy: 0.7832
Epoch 9/100
100/100 [==============================] - ETA: 0s - loss: 0.5385 - accuracy: 0.7765
Epoch 00009: saving model to TrainedModels\FastRCNN.h5

Reference:

==========================================================================

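I wrote a two-class model based on Fast R-CNN to detect airplanes in photos. The training dataset is generated by selective search. When I use a dataset with a negative/positive ratio of about 1, the model only reaches about 0.6 accuracy on the training set. When I raise the N/P ratio so that it is closer to the original ratio produced by selective search, training accuracy can reach 0.9, but the model performs poorly when predicting on the test set. During training, the training accuracy stays the same after every epoch, and in TensorBoard I can see that the layer weights do not change from epoch to epoch: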
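The "weights do not change" observation can also be checked numerically instead of by eye in TensorBoard. The helper below is a minimal sketch: `max_weight_change` is a hypothetical name, and the toy arrays stand in for two snapshots that, in Keras, would come from `model.get_weights()` taken before and after an epoch.

```python
import numpy as np

def max_weight_change(before, after):
    """Largest absolute change of each weight array between two snapshots."""
    return [float(np.max(np.abs(a - b))) for b, a in zip(before, after)]

# Toy stand-ins for two `model.get_weights()` snapshots.
before = [np.zeros((3, 3)), np.ones(3)]
after = [np.zeros((3, 3)), np.ones(3) + 0.01]

changes = max_weight_change(before, after)
# Indices of weight arrays that did not move at all.
frozen = [i for i, change in enumerate(changes) if change == 0.0]
print(frozen)  # [0]
```

Arrays that stay exactly at zero change across epochs point to layers that receive no gradient, which narrows down where to look.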

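This is the basic structure of my model. VGG16 is the feature extractor and outputs a 28*28 feature map to the ROI pooling layer. I tried changing the activation from ReLU to SELU, without success:

Here are the input image and its feature maps before (28*28*512) and after (32*14*14*512) the ROI pooling layer:

I generated the model with this code: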

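import os

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, TimeDistributed
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

# Custom layer defined in ROI_Pooling.py (import path assumed)
from ROI_Pooling import RoiPoolingConv


def build_model():
    # NOTE: pooled_square_size is not defined in this snippet; it is set elsewhere in the script
    num_rois = 32
    roi_input = Input(shape=(num_rois, 4), name="input_2")
    model_cnn = tf.keras.applications.VGG16(
        include_top=True,
        weights='imagenet'
    )
    # layers[13] is block4_conv3, which outputs the 28*28*512 feature map
    x = model_cnn.layers[13].output
    x = RoiPoolingConv(pooled_square_size, roi_input.shape[1])([x, roi_input])
    # Reuse the remaining VGG16 layers (block5 onwards) on every ROI
    for layer in model_cnn.layers[15:]:
        x = TimeDistributed(layer)(x)
    x = TimeDistributed(Dense(512, activation='sigmoid'))(x)
    x = TimeDistributed(Dense(2, activation='softmax'))(x)
    model_final = Model(inputs=[model_cnn.input, roi_input], outputs=x)
    opt = Adam(learning_rate=0.0001)
    model_final.compile(
        loss=tf.keras.losses.BinaryCrossentropy(),
        optimizer=opt,
        metrics=["accuracy"]
    )
    model_final.save(os.path.join("TrainedModels", "FastRCNN.h5"))
    return model_final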
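For readers unfamiliar with the ROI pooling step used in the model above, here is a minimal NumPy sketch of ROI max pooling for a single ROI. It is not the `RoiPoolingConv` layer from this post, which may be implemented differently; the function name `roi_max_pool` and the convention that ROI coordinates are given in feature-map cells with exclusive `x2`/`y2` are assumptions made for illustration.

```python
import numpy as np

def roi_max_pool(feature_map, roi, pooled_size):
    """Max-pool one ROI of an (H, W, C) feature map to (pooled_size, pooled_size, C).

    roi is (x1, y1, x2, y2) in feature-map cells, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    region = feature_map[y1:y2, x1:x2, :]
    h, w, c = region.shape
    if h == 0 or w == 0:
        raise ValueError("ROI must cover at least one feature-map cell")

    # Bin edges that split the region into pooled_size x pooled_size cells.
    ys = np.linspace(0, h, pooled_size + 1)
    xs = np.linspace(0, w, pooled_size + 1)

    out = np.empty((pooled_size, pooled_size, c), dtype=feature_map.dtype)
    for i in range(pooled_size):
        y_start = int(np.floor(ys[i]))
        y_end = max(int(np.ceil(ys[i + 1])), y_start + 1)
        for j in range(pooled_size):
            x_start = int(np.floor(xs[j]))
            x_end = max(int(np.ceil(xs[j + 1])), x_start + 1)
            # Keep the maximum of each channel inside the bin.
            out[i, j] = region[y_start:y_end, x_start:x_end].max(axis=(0, 1))
    return out

feature_map = np.random.rand(28, 28, 512).astype(np.float32)
pooled = roi_max_pool(feature_map, (4, 6, 18, 20), pooled_size=7)
print(pooled.shape)  # (7, 7, 512)
```

Whatever the ROI size, the output has a fixed spatial size, which is what lets the dense layers be shared across ROIs.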

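The complete code can be seen here:

I have tried adding BatchNormalization, tuning the learning rate, and simply adding more layers, but the model does not improve at all. I would be very grateful if someone could point out the key flaw in this model so that I can test it further. Thanks.

I strongly suspect that something odd is going on with this VGG16:

This is an input image:

This is its corresponding output feature map:

Damn, now I know what the problem is:

In ROI_Pooling.py: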

def call(self, x, mask=None):
    assert (len(x) == 2)
    # x[0] is image with shape (rows, cols, channels)
    img = x[0]
    # x[1] is roi with shape (num_rois,4) with ordering (x1,y1,x2,y2)
    rois = x[1]

    input_shape = img.shape

    outputs = []

    # Slice over the whole batch axis, so every sample's ROIs are used
    x1 = rois[:, :, 0]
    y1 = rois[:, :, 1]
    x2 = rois[:, :, 2]
    y2 = rois[:, :, 3]

It used to be:

def call(self, x, mask=None):
    assert (len(x) == 2)

    # x[0] is image with shape (rows, cols, channels)
    img = x[0]

    # x[1] is roi with shape (num_rois,4) with ordering (x,y,w,h)
    rois = x[1]

    input_shape = img.shape

    outputs = []

    for roi_idx in range(self.num_rois):
        # Bug: the batch index is hard-coded to 0
        x1 = rois[0, roi_idx, 0]
        y1 = rois[0, roi_idx, 1]
        x2 = rois[0, roi_idx, 2]
        y2 = rois[0, roi_idx, 3]

You can clearly see that only the ROIs at batch index 0 were used to generate the result.
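The difference between the two indexing styles can be reproduced with a small NumPy array. This is only an illustration with made-up coordinates; the real layer operates on tensors inside `call`, but the indexing behaves the same way.

```python
import numpy as np

batch_size, num_rois = 2, 3
# rois[b, r] holds (x1, y1, x2, y2) for ROI r of sample b (made-up values).
rois = np.arange(batch_size * num_rois * 4).reshape(batch_size, num_rois, 4)

# Old version: the batch index is fixed to 0, so sample 1 is never read.
old_x1 = np.array([rois[0, roi_idx, 0] for roi_idx in range(num_rois)])

# Fixed version: slice across the whole batch axis.
new_x1 = rois[:, :, 0]

print(old_x1.shape)  # (3,)
print(new_x1.shape)  # (2, 3)
print(new_x1)
# [[ 0  4  8]
#  [12 16 20]]
```

With the old indexing, every image in a batch is pooled with the ROI coordinates of the first image, so the labels of the remaining images no longer match the regions being pooled.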

Now the results are much better:
