
Python: Why do the metrics computed by model.evaluate() differ from the metrics tracked during training in Keras?


I use Keras 2.0.4 (TensorFlow backend) for an image classification task (based on a pre-trained model). During training/fine-tuning I track all metrics in use (e.g. categorical_accuracy, categorical_crossentropy) with a CSVLogger, including the corresponding metrics on the validation set (i.e. val_categorical_accuracy, val_categorical_crossentropy).

Via the callback ModelCheckpoint I track the best configuration of the weights (save_best_only=True). To evaluate the model on the validation set, I use model.evaluate().

My expectation is: the metrics tracked by CSVLogger (for the 'best' epoch) equal the metrics computed by model.evaluate(). Unfortunately, this is not the case: the metrics differ by +/- 5%. Is there a reason for this behavior?
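
For reference, the comparison I make looks roughly like this (a minimal sketch; logged_metrics.txt, best_epoch.hdf5, X_val and Y_val match the code shown in the EDIT below):

    import csv
    from keras.models import load_model

    # per-epoch metrics written by CSVLogger
    with open('logged_metrics.txt') as f:
        rows = list(csv.DictReader(f))

    # the 'best' epoch is the one with the lowest validation loss,
    # i.e. the epoch whose weights ModelCheckpoint(save_best_only=True) kept
    best_row = min(rows, key=lambda r: float(r['val_loss']))
    print('best epoch according to CSVLogger:', best_row)

    # evaluate the checkpointed model on the same validation data
    model = load_model('best_epoch.hdf5')
    for name, value in zip(model.metrics_names, model.evaluate(X_val, Y_val, verbose=0)):
        print(name, value)  # expectation: matches best_row; observed: +/- 5% off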


EDIT:

After some tests I could gain some insights:

  • If I do not use a generator for the training and validation data (hence no model.fit_generator()), the problem does not occur. --> Using ImageDataGenerator for the training and validation data is the source of the discrepancy. (Note that for computing evaluate I do not use a generator, but I do use the same validation data, at least if ImageDataGenerator works as expected...) I think ImageDataGenerator does not work as it should (please, also have a look at this). A quick check for this hypothesis is sketched right after the code listing below.
  • If I do not use generators at all, this problem does not occur. Id est, the metrics tracked via CSVLogger (for the 'best' epoch) equal the metrics computed via model.evaluate(). Interestingly, there is another problem: if you use the same data for training and validation, there will be a difference between the training metrics (e.g. loss) and the validation metrics (e.g. val_loss) at the end of each epoch.
  • Code used:

    ############################ import section ############################
    from __future__ import print_function # perform like in python 3.x
    from keras.datasets import mnist
    from keras.utils import np_utils # numpy utils for to_categorical()
    from keras.models import Model, load_model
    from keras.layers import Dense, GlobalAveragePooling2D, Dropout, GaussianDropout, Conv2D, MaxPooling2D
    from keras.optimizers import SGD, Adam
    from keras import backend as K
    from keras.preprocessing.image import ImageDataGenerator 
    from keras import metrics
    import os
    import sys
    from scipy import misc
    import numpy as np
    from keras.applications.vgg16 import preprocess_input as vgg16_preprocess_input
    from keras.applications import VGG16
    from keras.callbacks import CSVLogger, ModelCheckpoint
    
    
    ############################ manual settings ###########################
    # general settings
    seed = 1337
    
    loss_function = 'categorical_crossentropy'
    
    learning_rate = 0.001
    
    epochs = 10
    
    batch_size = 20
    
    nb_classes = 5 
    
    img_width, img_height = 400, 400 # >= 48 necessary, as VGG16 is used
    
    chosen_optimizer = SGD(lr=learning_rate, momentum=0.0, decay=0.0, nesterov=False)
    
    steps_per_epoch = 40 // batch_size   # 40 training samples in 5 classes
    validation_steps = 40 // batch_size  # 40 validation samples in 5 classes
    
    data_dir = # TODO: set path where data is stored (folders: 'train', 'val', 'test'; within each folder are folders named by classes)
    
    # callbacks: CSVLogger & ModelCheckpoint
    filepath = # TODO: set path, where you want to store files generated by the callbacks
    file_best_checkpoint= 'best_epoch.hdf5'
    file_csvlogger = 'logged_metrics.txt'
    
    modelcheckpoint_best_epoch= ModelCheckpoint(filepath=os.path.join(filepath, file_best_checkpoint), 
                                      monitor = 'val_loss' , verbose = 1, 
                                      save_best_only = True, 
                                      save_weights_only=False, mode='auto', 
                                      period=1) # every epoch executed
    csvlogger = CSVLogger(os.path.join(filepath, file_csvlogger) , separator=',', append=False)
    
    
    
    ############################ prepare data ##############################
    # get validation data (for evaluation)
    X_val, Y_val = # TODO: load validation data (4d array: samples, img_width, img_height, nb_channels) IMPORTANT: 5 classes with 8 images each.
    
    # preprocess data with the VGG16 preprocessing imported above
    my_preprocessing_function = vgg16_preprocess_input
    
    # 'augmentation' configuration we will use for training
    train_datagen = ImageDataGenerator(preprocessing_function = my_preprocessing_function) # only preprocessing; static data set
    
    # 'augmentation' configuration we will use for validation
    val_datagen = ImageDataGenerator(preprocessing_function = my_preprocessing_function) # only preprocessing; static data set
    
    train_data_dir = os.path.join(data_dir, 'train')
    validation_data_dir = os.path.join(data_dir, 'val')
    train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        shuffle = True,
        seed = seed, # random seed for shuffling and transformations
        class_mode='categorical')  # label type (categorical = one-hot vector)
    
    validation_generator = val_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        shuffle = True,
        seed = seed, # random seed for shuffling and transformations
        class_mode='categorical')  # label type (categorical = one-hot vector)
    
    
    
    ############################## training ###############################
    print("\n---------------------------------------------------------------")
    print("------------------------ training model -----------------------")
    print("---------------------------------------------------------------")
    # create the base pre-trained model
    base_model = VGG16(include_top=False, weights = None, input_shape=(img_width, img_height, 3), pooling = 'max', classes = nb_classes)
    model_name =  "VGG_modified"
    
    # do not freeze any layers --> all layers trainable
    for layer in base_model.layers:
        layer.trainable = True
    
    # define topping of base_model
    x = base_model.output # get the last layer of our base_model
    x = Dense(1024, activation='relu', name='fc1')(x)
    x = Dense(1024, activation='relu', name='fc2')(x)
    predictions = Dense(nb_classes, activation='softmax', name='predictions')(x)
    
    # finally, stack model together
    model = Model(outputs=predictions, name= model_name, inputs=base_model.input) #Keras 1.x.x: model = Model(input=base_model.input, output=predictions) 
    print(model.summary())
    
    # compile the model (should be done *after* setting layers to non-trainable)
    model.compile(optimizer = chosen_optimizer, loss=loss_function, 
                metrics=['categorical_accuracy','kullback_leibler_divergence'])
    
    # train the model on your data
    model.fit_generator(
        train_generator,
        steps_per_epoch=steps_per_epoch,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=validation_steps,
        callbacks = [csvlogger, modelcheckpoint_best_epoch])
    
    
    
    ############################## evaluation ##############################
    print("\n\n---------------------------------------------------------------")
    print("------------------ Evaluation of Best Epoch -------------------")
    print("---------------------------------------------------------------")
    # load model (corresponding to best training epoch)
    model = load_model(os.path.join(filepath, file_best_checkpoint))
    
    # evaluate model on validation data (in test mode!)
    list_of_metrics = model.evaluate(X_val, Y_val, batch_size=batch_size, verbose=1, sample_weight=None)
    print('\nMetrics:')
    for metric, value in zip(model.metrics_names, list_of_metrics):
        print(metric + ':', str(value))
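
To test whether ImageDataGenerator reproduces the manual preprocessing (first bullet above), one can compare a full, non-shuffled pass through the validation generator against the manually loaded X_val. A minimal sketch, reusing the names defined in the listing above; the order-insensitive statistics sidestep any ordering mismatch:

    # non-shuffled generator, so one pass covers each validation image exactly once
    check_generator = val_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        shuffle=False,
        class_mode='categorical')

    # collect exactly one full pass over the 40 validation images
    n_batches = check_generator.samples // batch_size
    X_gen = np.vstack([next(check_generator)[0] for _ in range(n_batches)])

    # X_val holds the manually loaded + preprocessed validation images;
    # if the statistics differ, the generator's preprocessing deviates
    print('generator mean/std:', X_gen.mean(), X_gen.std())
    print('manual    mean/std:', X_val.mean(), X_val.std())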
    

EDIT 2:

See the remark in the first bullet: the problem still occurs if I use the same generator for the validation data during training and evaluation (by using evaluate_generator()). Hence, it is definitely a problem caused by the generator(s)...
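
For completeness, the generator-based evaluation used for this test looks roughly like this (a sketch; names as in the listing above):

    # evaluate the checkpointed model twice: on the arrays and via the same
    # generator that supplied the validation data during training
    model = load_model(os.path.join(filepath, file_best_checkpoint))
    scores_arrays = model.evaluate(X_val, Y_val, batch_size=batch_size, verbose=0)
    scores_generator = model.evaluate_generator(validation_generator, steps=validation_steps)

    for name, a, g in zip(model.metrics_names, scores_arrays, scores_generator):
        print('%s: arrays=%.4f, generator=%.4f' % (name, a, g))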

Evaluate the metrics only on the validation dataset.

During training, the metrics computed on the training dataset do not reflect the true metrics of the model at the end of the epoch, because the model is updated (modified) after every batch.
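
To make this concrete (a minimal sketch, not from the original post; X_train and Y_train are assumed to be the training arrays): a custom callback can re-evaluate the finished model on the training data at the end of each epoch and compare against the logged running average:

    from keras.callbacks import Callback

    class EndOfEpochEval(Callback):
        """Re-evaluate the training data with the final weights of the epoch."""
        def __init__(self, X_train, Y_train):
            super(EndOfEpochEval, self).__init__()
            self.X_train = X_train
            self.Y_train = Y_train

        def on_epoch_end(self, epoch, logs=None):
            scores = self.model.evaluate(self.X_train, self.Y_train, verbose=0)
            # logs['loss'] is a running average over the batches of this epoch;
            # scores[0] is the loss of the finished model on the same data
            print('epoch %d: running-average loss=%.4f, end-of-epoch loss=%.4f'
                  % (epoch, logs['loss'], scores[0]))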


Does this help?

CSVLogger tracks the metrics on the validation set after every epoch. Let us assume the last epoch yields the best configuration of the weights. That means the metrics tracked last on the validation set are the metrics obtained when evaluating on the validation set. What am I missing? Well, which metric is used for save_best_only? The monitored quantity is the validation loss (val_categorical_crossentropy). Actually, that should not matter... Sorry, I am stuck on this case as well. Ideally you should come up with some code so that we can reproduce your problem and help solve it :-) I tracked the problem down. See the edit above.