Machine learning: same CNN architecture, but different results


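I have a CNN that uses the VGG16 architecture to compute bottleneck features for the training and test data, and then feeds those features to my custom fully connected layers to classify the images.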

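#imports assumed for standalone Keras 2.x (they are not shown in the original post)
from keras import applications, optimizers
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.layers import Dense, Dropout, Flatten
from keras.models import Model, Sequential
from keras.preprocessing.image import ImageDataGenerator

#create data augmentations for training set; helps reduce overfitting and find more features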
train_datagen = ImageDataGenerator(rescale=1./255,
                        shear_range = 0.2,
                        zoom_range = 0.2,
                        horizontal_flip=True)

#use ImageDataGenerator to load validation images; data augmentation is not necessary for validation
val_datagen = ImageDataGenerator(rescale=1./255)

#load VGG16 model, pretrained on imagenet database
model = applications.VGG16(include_top=False, weights='imagenet')

#generator to load images into NN
train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

#total number of images used for training data
num_train = len(train_generator.filenames)

#save features to numpy array file so features do not overload memory
bottleneck_features_train = model.predict_generator(train_generator, num_train // batch_size)

val_generator = val_datagen.flow_from_directory(
        val_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

num_val = len(val_generator.filenames)

bottleneck_features_validation = model.predict_generator(val_generator, num_val // batch_size)



#used to retrieve the labels of the images
label_datagen = ImageDataGenerator(rescale=1./255)

#generators can create class labels for each image in either 
train_label_generator = label_datagen.flow_from_directory(  
    train_dir,  
    target_size=(img_width, img_height),  
    batch_size=batch_size,  
    class_mode=None,  
    shuffle=False)  

#total number of images used for training data
num_train = len(train_label_generator.filenames)

#load features from VGG16 and pair each image with corresponding label (0 for normal, 1 for pneumonia)
#train_data = np.load('xray/bottleneck_features_train.npy')
#get the class labels generated by train_label_generator 
train_labels = train_label_generator.classes

val_label_generator = label_datagen.flow_from_directory(  
    val_dir,  
    target_size=(img_width, img_height),  
    batch_size=batch_size,  
    class_mode=None,  
    shuffle=False)

num_val = len(val_label_generator.filenames)

#val_data = np.load('xray/bottleneck_features_validation.npy')
val_labels = val_label_generator.classes

#create fully connected layers, replacing the ones cut off from the VGG16 model
model = Sequential()
#converts model's expected input dimensions to same shape as bottleneck feature arrays 
model.add(Flatten(input_shape=bottleneck_features_train.shape[1:]))
#ignores a fraction of input neurons so they do not become co-dependent on each other; helps prevent overfitting
model.add(Dropout(0.7))
#normal fully-connected layer with relu activation. Replaces all negative inputs with 0 and does not fire neuron,
#creating a lighter network
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.7))
#output layer to classify 0 or 1
model.add(Dense(1, activation='sigmoid'))

#compile model and specify which optimizer and loss function to use
#optimizer used to update the weights to optimal values; adam optimizer maintains separate learning rates
#for each weight and updates accordingly

#cross-entropy function measures the ability of model to correctly classify 0 or 1
model.compile(optimizer=optimizers.Adam(lr=0.0007), loss='binary_crossentropy', metrics=['accuracy'])

#used to stop training if NN shows no improvement for 5 epochs
early_stop = EarlyStopping(monitor='val_loss', min_delta=0.01, patience=5, verbose=1)

#checks each epoch as it runs and saves the weight file from the model with the lowest validation loss
checkpointer = ModelCheckpoint(filepath=top_model_weights_dir, verbose=1, save_best_only=True)

#fit the model to the data
history = model.fit(bottleneck_features_train, train_labels,
        epochs=epochs,
        batch_size=batch_size,
        callbacks = [early_stop, checkpointer],
        verbose=2,
        validation_data=(bottleneck_features_validation, val_labels))
After calling train_top_model(), the CNN reaches about 86% accuracy after roughly 10 epochs.

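However, when I try to implement this architecture by building the fully connected layers directly on top of the VGG16 layers, the network stays at a val_acc of 0.5000 and essentially does not train. Is there something wrong with the code?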

epochs = 10
batch_size = 20

train_datagen = ImageDataGenerator(rescale=1./255,
                        shear_range = 0.2,
                        zoom_range = 0.2,
                        horizontal_flip=True)

val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='binary',
        shuffle=False)

num_train = len(train_generator.filenames)

val_generator = val_datagen.flow_from_directory(
        val_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='binary',
        shuffle=False)

num_val = len(val_generator.filenames)

base_model = applications.VGG16(weights='imagenet', include_top=False,
                                input_shape=(img_width, img_height, 3))

x = base_model.output
x = Flatten()(x)
x = Dropout(0.7)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.7)(x)
predictions = Dense(1, activation='sigmoid')(x)


model = Model(inputs=base_model.input, outputs=predictions)

#freeze the first 19 layers (the VGG16 convolutional base) so that only the new layers are trained
for layer in model.layers[:19]:
    layer.trainable = False

checkpointer = ModelCheckpoint(filepath=top_model_weights_dir, verbose=1, save_best_only=True)

model.compile(optimizer=optimizers.Adam(lr=0.0007), loss='binary_crossentropy',
              metrics=['accuracy'])

history = model.fit_generator(train_generator, 
                          steps_per_epoch=(num_train//batch_size), 
                          validation_data=val_generator,
                          validation_steps=(num_val//batch_size),
                          callbacks=[checkpointer], 
                          verbose=1,
                          epochs=epochs)

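The reason is that in the second approach you did not freeze the VGG16 layers. In other words, you are training the whole network, whereas in the first approach you only train the weights of the fully connected layers. Use something like the following: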

for layer in base_model.layers[:end_layer]:
        layer.trainable = False

where end_layer is the index of the last layer you imported from VGG16.
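To confirm that the freeze actually took effect, it can help to list which layers still have trainable weights after compiling. The following is a minimal sketch and not the asker's exact setup: it assumes `tensorflow.keras` is available, and it uses `weights=None` and a 64x64 input purely to keep the example light (the post uses `weights='imagenet'` and its own image size). In the Keras versions I am aware of, VGG16 with `include_top=False` has 19 layers counting the input layer, so `model.layers[:19]` and `base_model.layers` should cover the same layers.

```python
# Sketch only: weights=None and the 64x64 input are assumptions made to keep
# the example light; the post uses weights='imagenet' and its own image size.
from tensorflow.keras import Model
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Dropout, Flatten

base_model = VGG16(weights=None, include_top=False, input_shape=(64, 64, 3))

# Same head as in the question, stacked on the convolutional base.
x = Flatten()(base_model.output)
x = Dropout(0.7)(x)
x = Dense(128, activation='relu')(x)
x = Dropout(0.7)(x)
predictions = Dense(1, activation='sigmoid')(x)
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze every layer of the convolutional base before compiling.
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Layers that will still be updated during training (should be the two Dense layers).
trainable_layer_names = [layer.name for layer in model.layers if layer.trainable_weights]
frozen_base = all(not layer.trainable for layer in base_model.layers)
print(trainable_layer_names, frozen_base)
```

If only the two Dense layers are listed, the base is frozen and the problem most likely lies elsewhere in the training setup.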

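The two training procedures are different. What confuses me, though, is that the first model achieves the better accuracy; I would not have expected that. By the way, have you tried a lower or a higher learning rate in the second approach? Could you try that and report the results?

Already done:
for layer in model.layers[:19]: layer.trainable = False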
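One further difference between the two procedures may be worth checking, although the thread does not confirm it as the cause. In the first approach, `model.fit` on in-memory arrays shuffles the samples by default. In the second approach, the training generator is created with `shuffle=False`, so `flow_from_directory` yields the images in class-folder order. The pure-Python sketch below uses a hypothetical label list (100 images per class) to show what that ordering does to the class mix of each batch.

```python
import random

def batch_class_counts(labels, batch_size):
    """Return, for each batch, how many distinct classes it contains."""
    return [len(set(labels[i:i + batch_size]))
            for i in range(0, len(labels), batch_size)]

# Hypothetical labels ordered by class folder, as an unshuffled directory
# generator would yield them: 100 "normal" (0) followed by 100 "pneumonia" (1).
labels = [0] * 100 + [1] * 100
batch_size = 20

# Without shuffling, every batch contains a single class.
unshuffled = batch_class_counts(labels, batch_size)

# With shuffling, batches mix both classes.
shuffled_labels = labels[:]
random.Random(0).shuffle(shuffled_labels)
shuffled = batch_class_counts(shuffled_labels, batch_size)

print("unshuffled:", unshuffled)
print("shuffled:  ", shuffled)
```

If this applies, setting `shuffle=True` on the training generator of the second approach (while keeping `shuffle=False` for validation) would be a cheap experiment to run.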