Adding a layer stops Keras from learning

keras, deep-learning, conv-neural-network, transfer-learning

The code

I have implemented an image classifier with Keras on the cats and dogs dataset () (transfer learning with the Inception network). The code runs without errors, but from the first epoch the accuracy stays at 50% on both the training and validation sets, and the loss does not decrease. I am using Atom with Hydrogen.

The problem goes away when I remove the marked layer, and I can't seem to understand why this happens. Things I have tried to fix it:

  • Different batch sizes - 4, 16, 64, 256
  • Changing the optimizer - tried adam, rmsprop and sgd with modified learning rates
  • Trying different activations for the layer - relu, sigmoid and LeakyReLU
  • Changing the dropout rates - the problem goes away when the dropout is 0.9 (i.e. making the layer useless, which obviously happens for a reason, but it also points to something I am missing)
  • Changing the final activation to sigmoid

Can someone tell me what I am missing? I can't think of any reason why adding a layer would stop the network from learning (the two heads being compared are sketched right after this list, followed by the full code).
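For reference, a minimal sketch of the two classifier heads being compared, reduced from the full script below (the (3, 3, 2048) feature shape is what InceptionV3 with include_top=False should produce for a 150x150x3 input):

    from keras.models import Sequential
    from keras.layers import Flatten, Dense, Dropout

    def head(with_extra_layer):
        m = Sequential()
        m.add(Flatten(input_shape=(3, 3, 2048)))
        m.add(Dense(1024, activation="relu"))
        m.add(Dropout(0.5))
        if with_extra_layer:                    # the "marked" layer from the question
            m.add(Dense(10, activation="relu"))
            m.add(Dropout(0.8))
        m.add(Dense(2, activation="softmax"))
        return m

    working = head(with_extra_layer=False)  # trains normally
    stuck = head(with_extra_layer=True)     # accuracy stays at 50% from epoch 1

The full script: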

    import numpy as np
    from keras.preprocessing.image import ImageDataGenerator
    from keras.models import Sequential,Model
    from keras.layers import LeakyReLU,Dropout, Flatten, Dense,Input
    from keras import applications
    from keras.preprocessing import image
    from keras import backend as K
    from keras import regularizers
    from keras.optimizers import Adam
    K.set_image_dim_ordering('tf')
    input_tensor = Input(shape=(150,150,3))
    
    img_width, img_height = 150,150
    
    top_model_weights_path = 'bottleneck_fc_model.h5'
    train_data_dir = 'Cats and Dogs Dataset/train'
    validation_data_dir = 'Cats and Dogs Dataset/validation'
    nb_train_samples = 20000
    nb_validation_samples = 5000
    epochs = 50
    batch_size = 128
    
    base_model=applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_tensor=input_tensor, pooling=None)
    i=0;
    for layer in base_model.layers:
        layer.trainable = False
        i+=1
    base_model.output
    top_model=Sequential()
    top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
    top_model.add(Dense(1024,activation="relu"))
    top_model.add(Dropout(0.5))
    top_model.add(Dense(10,activation="relu"))  # layer with the issue
    top_model.add(Dropout(0.8))
    top_model.add(Dense(2, activation='softmax'))
    model = Model(inputs=base_model.input,outputs=top_model(base_model.output))
    
    model.summary()
    datagen = ImageDataGenerator(rescale=1. / 255)
    
    train_data = datagen.flow_from_directory(train_data_dir,target_size=(img_width, img_height),batch_size=batch_size,classes=[ 'cats','dogs'])#,class_mode="binary",shuffle=True)
    
    
    validation_data = datagen.flow_from_directory(validation_data_dir,target_size=(img_width, img_height), batch_size=batch_size,classes=['cats','dogs'])#,class_mode="binary",shuffle=True)
    
    adm = Adam(lr=0.02)
    model.compile(optimizer=adm,loss='categorical_crossentropy', metrics=['accuracy'])
    
    model.fit_generator(train_data, steps_per_epoch=nb_train_samples//batch_size, epochs=epochs,validation_data=validation_data, shuffle=True,verbose=1)
    
I decreased the number of units in the first dense layer and increased the number of units in the second dense layer, and also lowered the dropout rates. Run this code and let me know. A more complex network has a higher chance of overfitting, and increasing the dropout value can keep that layer from learning anything. Try to keep your network simple.
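Concretely, the head being suggested there (pulled out of the full script further down; feature_shape stands in for base_model.output_shape[1:]) is:

    from keras.models import Sequential
    from keras.layers import Flatten, Dense, Dropout

    feature_shape = (3, 3, 2048)  # InceptionV3 output shape for a 150x150x3 input
    suggested_head = Sequential()
    suggested_head.add(Flatten(input_shape=feature_shape))
    suggested_head.add(Dense(512, activation="relu"))   # fewer units than 1024
    suggested_head.add(Dropout(0.4))                    # lower dropout than 0.5
    suggested_head.add(Dense(128, activation="relu"))   # wider than the old Dense(10)
    suggested_head.add(Dropout(0.2))                    # lower dropout than 0.8
    suggested_head.add(Dense(2, activation="softmax"))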


It doesn't work. One thing I have noticed is that no matter what I change, as long as I have that extra layer my validation loss gets stuck at 8.0590, regardless of how long I run it or how many nodes the layer has or which activation it uses.

As I said below: reduce the complexity, remove that layer; it is overfitting on the training data.

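As an aside, a validation loss pinned at exactly 8.0590 is consistent with the head predicting a single class for every sample rather than learning: Keras clips softmax outputs to its default epsilon of 1e-7 before taking the log, so a constant predictor on a balanced cats/dogs set gives

    import numpy as np

    eps = 1e-7  # Keras backend epsilon (default)
    # half the samples get probability ~1 for the true class, half get ~eps
    loss = 0.5 * -np.log(1.0 - eps) + 0.5 * -np.log(eps)
    print(round(loss, 4))  # 8.059

which matches the number above. The modified script with those changes applied:
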
    import numpy as np
    from keras.preprocessing.image import ImageDataGenerator
    from keras.models import Sequential,Model
    from keras.layers import LeakyReLU,Dropout, Flatten, Dense,Input
    from keras import applications
    from keras.preprocessing import image
    from keras import backend as K
    from keras import regularizers
    from keras.optimizers import Adam
    K.set_image_dim_ordering('tf')
    input_tensor = Input(shape=(150,150,3))
    
    img_width, img_height = 150,150
    
    top_model_weights_path = 'bottleneck_fc_model.h5'
    train_data_dir = 'Cats and Dogs Dataset/train'
    validation_data_dir = 'Cats and Dogs Dataset/validation'
    nb_train_samples = 20000
    nb_validation_samples = 5000
    epochs = 50
    batch_size = 64
    
    base_model=applications.inception_v3.InceptionV3(include_top=False, weights='imagenet', input_tensor=input_tensor, pooling=None)
    i=0;
    for layer in base_model.layers:
        layer.trainable = False
        i+=1
    base_model.output
    top_model=Sequential()
    top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
    top_model.add(Dense(512,activation="relu"))  # decrease in units
    top_model.add(Dropout(0.4))                  # change in dropout
    top_model.add(Dense(128,activation="relu"))  # increase in units
    top_model.add(Dropout(0.2))                  # decrease in dropout
    top_model.add(Dense(2, activation='softmax'))
    model = Model(inputs=base_model.input,outputs=top_model(base_model.output))
    
    model.summary()
    datagen = ImageDataGenerator(rescale=1. / 255)
    
    train_data = datagen.flow_from_directory(train_data_dir,target_size=(img_width, img_height),batch_size=batch_size,classes=[ 'cats','dogs'])#,class_mode="binary",shuffle=True)
    
    
    validation_data = datagen.flow_from_directory(validation_data_dir,target_size=(img_width, img_height), batch_size=batch_size,classes=['cats','dogs'])#,class_mode="binary",shuffle=True)
    
    adm = Adam(lr=0.02)
    model.compile(optimizer=adm,loss='categorical_crossentropy', metrics=['accuracy'])
    
    model.fit_generator(train_data, steps_per_epoch=nb_train_samples//batch_size, epochs=epochs,validation_data=validation_data, shuffle=True,verbose=1)