Python Siamese network not learning: loss decreases but accuracy does not improve


I am trying to build a Siamese network to check the similarity between two whales (individuals). My dataset has 15k images across 5k classes, which is why I chose a Siamese approach, and also so I don't have to retrain the model for new classes. This is the network architecture:

        import numpy.random as rng
        from keras import backend as K
        from keras.layers import Conv2D, Dense, Flatten, Input, Lambda, MaxPooling2D
        from keras.models import Model, Sequential
        from keras.optimizers import Adam
        from keras.regularizers import l2

        def contrastiveLoss(y_true, y_pred):
            # y_true = 1 for a similar pair, 0 for a dissimilar pair
            margin = 1
            return K.mean(y_true * K.square(y_pred) + (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))

        def weightInit(shape, dtype=None):
            return rng.normal(loc=0.0, scale=1e-2, size=shape)

        def biasInit(shape, dtype=None):
            return rng.normal(loc=0.5, scale=1e-2, size=shape)

        input_shape = (200, 200, 1)
        left_input = Input(input_shape)
        right_input = Input(input_shape)

        model = Sequential()
        model.add(Conv2D(64,(10,10),activation='relu',input_shape=input_shape,kernel_initializer=weightInit,kernel_regularizer=l2(2e-4)))
        model.add(MaxPooling2D())
        model.add(Conv2D(128,(7,7),activation='relu',kernel_regularizer=l2(2e-4),kernel_initializer=weightInit,bias_initializer=biasInit))
        model.add(MaxPooling2D())
        model.add(Conv2D(128,(4,4),activation='relu',kernel_regularizer=l2(2e-4),kernel_initializer=weightInit,bias_initializer=biasInit))
        model.add(MaxPooling2D())
        model.add(Conv2D(256,(4,4),activation='relu',kernel_initializer=weightInit,kernel_regularizer=l2(2e-4),bias_initializer=biasInit))
        # model.add(BatchNormalization())
        # model.add(Dropout(0.25))
        model.add(Flatten())
        # model.add(Dropout(0.5))
        model.add(Dense(4096,activation="sigmoid",kernel_regularizer=l2(1e-3),kernel_initializer=weightInit,bias_initializer=biasInit))

        #call the convnet Sequential model on each of the input tensors so params will be shared
        encoded_l = model(left_input)
        encoded_r = model(right_input)
        #layer to merge two encoded inputs with the l1 distance between them
        L1_layer = Lambda(lambda tensors:K.abs(tensors[0] - tensors[1]))
        #call this layer on list of two input tensors.
        L1_distance = L1_layer([encoded_l, encoded_r])
        prediction = Dense(1,activation='sigmoid',bias_initializer=biasInit)(L1_distance)
        siamese_net = Model(inputs=[left_input,right_input],outputs=prediction)
        optimizer = Adam(0.001)
        siamese_net.compile(loss=contrastiveLoss, optimizer=optimizer)
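
As a side note on the head of the network: contrastive loss is normally defined on a non-negative embedding distance, whereas the final `Dense(1, activation='sigmoid')` on the L1 vector is the head usually paired with binary cross-entropy. Below is a minimal, self-contained sketch (not the question's code) of the distance-output head; the tiny Dense embedding is only a placeholder standing in for the convnet above, and all names are illustrative.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten, Input, Layer
from tensorflow.keras.models import Model

def contrastive_loss(y_true, y_pred):
    # Same formula as contrastiveLoss above; here y_pred is a distance.
    margin = 1.0
    return tf.reduce_mean(y_true * tf.square(y_pred)
                          + (1 - y_true) * tf.square(tf.maximum(margin - y_pred, 0.0)))

class EuclideanDistance(Layer):
    def call(self, inputs):
        a, b = inputs
        # The epsilon floor keeps the sqrt gradient finite at zero distance.
        sq = tf.reduce_sum(tf.square(a - b), axis=1, keepdims=True)
        return tf.sqrt(tf.maximum(sq, 1e-7))

input_shape = (8, 8, 1)  # small placeholder; the question uses (200, 200, 1)
inp = Input(input_shape)
embed = Model(inp, Dense(16, activation="relu")(Flatten()(inp)))

left_in, right_in = Input(input_shape), Input(input_shape)
distance = EuclideanDistance()([embed(left_in), embed(right_in)])
net = Model(inputs=[left_in, right_in], outputs=distance)
net.compile(loss=contrastive_loss, optimizer="adam")

x = np.random.rand(4, 8, 8, 1).astype("float32")
d = net.predict([x, x], verbose=0)  # identical pairs -> distance near 0
```

With this head the model's output is the distance itself, so similar pairs are pulled toward 0 and dissimilar pairs pushed past the margin.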
I try to run it like this:

# Training loop
evaluate_every = 10     # interval (iterations) for evaluating on one-shot tasks
loss_every = 10         # interval (iterations) for printing loss
checkOverfitting = 20
batch_size = 32
epochs = 9000
N_way = 5               # how many classes for testing one-shot tasks
n_val = 250             # number of one-shot tasks to validate on
lossHistory = []
val_accHistory = []
print("training")
for i in range(1, epochs):
    inputs, targets = dataTrainLoader.getBatch(batch_size)
    loss = siamese_net.train_on_batch(inputs, targets)
    lossHistory.append(loss)
    if i % checkOverfitting == 0:
        print("Comparing results with x_train and x_test to check overfitting")
        val_acc_test = dataTrainLoader.test_oneshot(siamese_net,N_way,n_val,X_test,y_test,verbose=True)
        val_acc_train = dataTrainLoader.test_oneshot(siamese_net,N_way,n_val,X_train,y_train,verbose=True)
        print("Accuracy in train {:.2f}, accuracy in test {:.2f} ".format(val_acc_train,val_acc_test))
    elif i % evaluate_every == 0:
        val_acc = dataTrainLoader.test_oneshot(siamese_net,N_way,n_val,X_test,y_test,verbose=True)
        val_accHistory.append(val_acc)
    if i % loss_every == 0:
        print("iteration {}, loss: {:.2f}, val_acc: {:.2f}".format(i,loss,val_acc))
I can add the missing functions if needed. DataTrainLoader is a class that generates the batches (pairs) for each epoch and builds the pairs for the N_way one-shot tests, returning the percentage of correct predictions.
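
For context, since DataTrainLoader itself isn't shown, a pair-batch generator for this kind of setup might look roughly like the sketch below. All names are hypothetical, and it assumes target 1 marks a same-class pair, matching contrastiveLoss above:

```python
import numpy as np

rng = np.random.default_rng(0)

def get_batch(X, y, batch_size):
    """Hypothetical sketch of getBatch: returns half same-class pairs
    (target 1.0) and half different-class pairs (target 0.0)."""
    left, right, targets = [], [], []
    for i in range(batch_size):
        same = i < batch_size // 2
        a = rng.integers(len(X))
        # pick the second image from the same class or a different one
        mask = (y == y[a]) if same else (y != y[a])
        b = rng.choice(np.flatnonzero(mask))
        left.append(X[a])
        right.append(X[b])
        targets.append(1.0 if same else 0.0)
    return [np.stack(left), np.stack(right)], np.asarray(targets)

# Toy data: 6 "images", 3 classes with 2 images each
X = np.random.rand(6, 200, 200, 1)
y = np.array([0, 0, 1, 1, 2, 2])
inputs, targets = get_batch(X, y, batch_size=4)
```

The returned `inputs` list feeds the two branches of the Siamese model directly in `train_on_batch(inputs, targets)`.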

The loss decreases until, after a certain number of iterations, it starts decreasing very slowly, something like 0.01 every two epochs. The accuracy, however, is a complete rollercoaster and stays very bad, on both the test and the train images. We can see it in the plots:
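
One quick sanity check on the plateauing loss is the loss formula itself. A NumPy mirror of contrastiveLoss (assuming, as the code above implies, that y_true = 1 marks a similar pair) makes its behaviour easy to verify:

```python
import numpy as np

def contrastive_loss_np(y_true, y_pred, margin=1.0):
    # NumPy mirror of contrastiveLoss: similar pairs (y_true = 1) are
    # penalised by the squared distance, dissimilar pairs (y_true = 0)
    # by a squared hinge on (margin - distance).
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(y_true * y_pred ** 2
                         + (1 - y_true) * np.maximum(margin - y_pred, 0) ** 2))

# A similar pair at zero distance and a dissimilar pair pushed past the
# margin both contribute zero loss:
print(contrastive_loss_np([1, 0], [0.0, 1.5]))   # 0.0
# A similar pair at distance 2 is penalised quadratically:
print(contrastive_loss_np([1], [2.0]))           # 4.0
```

Note that with this margin, a dissimilar pair only stops contributing loss once its predicted distance exceeds 1.0.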

I added augmentation and tried several things: changing the learning rate, the architecture, the kernel sizes, the loss function to binary cross-entropy, the batch size, and adding regularization, but it does not look like overfitting. I checked the parameters and compared the images against the targets by eye, and everything seems correct. I'm getting a bit desperate about how to improve this. The images can be told apart by eye, so it is not an impossible task. Any hints or tweaks, anything I'm missing? Thanks in advance for your help.