Keras: Why does my model's training accuracy grow very slowly (stay almost flat) after a certain number of epochs?
I am working with the CIFAR-10 dataset and trying to reach the benchmark, or at least 90% accuracy. I have tried all the approaches mentioned below, but most of them give the same result: after a while the training accuracy stops improving and stays flat, and the validation accuracy fluctuates slightly. The dataset directory looks like this:
\cifar
    \train (total 40,000 images. 4000 images per class. Total 10 classes)
        \airplane
        \automobile......(similar structure for test and validation as well)
    \test (total 10,000 images. 1000 images per class)
    \validation (total 10,000 images. 1000 images per class)
code.py
I have tried the following parameters:
import keras
from keras.models import Sequential  # was missing; Sequential() is used below
from keras.layers import (Conv2D, MaxPool2D, Flatten, Dense, Dropout,
                          Activation, BatchNormalization, GlobalAveragePooling2D)
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers, regularizers
classifier = Sequential()
classifier.add(Conv2D(filters=64, kernel_size=(3,3), input_shape= (32,32,3), use_bias=False))
classifier.add(BatchNormalization())
classifier.add(Activation('relu'))
classifier.add(MaxPool2D(pool_size=(2,2)))
classifier.add(Conv2D(filters=64, kernel_size=(3,3), use_bias=False))
classifier.add(BatchNormalization())
classifier.add(Activation('relu'))
classifier.add(MaxPool2D(pool_size=(2,2)))
classifier.add(Conv2D(filters=64, kernel_size=(3,3), use_bias=False))
classifier.add(BatchNormalization())
classifier.add(Activation('relu'))
classifier.add(MaxPool2D(pool_size=(2,2)))
classifier.add(GlobalAveragePooling2D())
classifier.add(Dense(units=10,activation='softmax'))
# Alternative optimizer, left disabled:
# sgd = optimizers.SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
classifier.compile(optimizer='nadam', loss='categorical_crossentropy', metrics=['accuracy'])
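# Note: featurewise_center and featurewise_std_normalization only take effect
# after the generator's fit() method has been called on sample data; without
# that call Keras warns and skips both.
train_datagen = ImageDataGenerator(rescale=1./255,
    featurewise_center=True, featurewise_std_normalization=True,
    shear_range=0.2, rotation_range=20, width_shift_range=0.2,
    height_shift_range=0.2, horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255,
    featurewise_center=True, featurewise_std_normalization=True)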
train_dataset = train_datagen.flow_from_directory(
    directory='cifar/train', target_size=(32,32),
    batch_size=16, class_mode='categorical')
test_dataset = test_datagen.flow_from_directory(
    directory='cifar/validation', target_size=(32,32),
    batch_size=16, class_mode='categorical')
# 40,000 training images / batch size 16 = 2500 steps per epoch;
# 10,000 validation images / batch size 16 = 625 validation steps.
classifier.fit_generator(train_dataset,
    steps_per_epoch=2500, epochs=50,
    validation_data=test_dataset, validation_steps=625)
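To get a feel for the size of the network posted above, the feature-map sizes and weight counts can be worked out by hand. The following is only a sketch: it assumes the Keras defaults (padding='valid' and stride 1 for Conv2D, a 2x2 MaxPool2D with stride 2), and the helper names conv_out, pool_out and trace are made up for this illustration.

```python
# Trace the spatial size and weight count of a conv-BN-ReLU-pool stack.
# Assumptions: 3x3 'valid' convolutions with stride 1 and no bias,
# 2x2 max pooling with stride 2, then global average pooling + Dense.

def conv_out(size, kernel=3):
    """Output width of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width of a non-overlapping max pool (odd sizes are floored)."""
    return size // pool

def trace(filters, size=32, channels=3, classes=10):
    """Return (spatial size after each block, total number of weights)."""
    sizes, params = [], 0
    for f in filters:
        params += 3 * 3 * channels * f  # conv kernel, use_bias=False
        params += 4 * f                 # BatchNormalization: gamma, beta, moving mean, moving variance
        size = pool_out(conv_out(size))
        sizes.append(size)
        channels = f
    params += channels * classes + classes  # Dense softmax layer (weights + bias)
    return sizes, params

sizes, params = trace([64, 64, 64])
print(sizes, params)   # [15, 6, 2] 76874

# The (64, 128, 128) layout discussed in the comments, for comparison.
print(trace([64, 128, 128]))   # ([15, 6, 2], 225482)
```

Under these assumptions the posted model has roughly 77 thousand weights and only a 2x2 feature map before global average pooling. That may be one reason it plateaus well below the 90% target, but the numbers alone do not prove it.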
These are the results per epoch:
Epoch 17/50
2500/2500 [==============================] - 259s 103ms/step - loss: 0.9305 - acc: 0.6840 - val_loss: 0.8195 - val_acc: 0.7111
Epoch 18/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.9280 - acc: 0.6817 - val_loss: 0.9981 - val_acc: 0.6816
Epoch 19/50
2500/2500 [==============================] - 260s 104ms/step - loss: 0.9112 - acc: 0.6896 - val_loss: 0.9393 - val_acc: 0.6786
Epoch 20/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.9053 - acc: 0.6881 - val_loss: 0.8509 - val_acc: 0.7172
Epoch 21/50
2500/2500 [==============================] - 259s 104ms/step - loss: 0.9110 - acc: 0.6874 - val_loss: 0.8427 - val_acc: 0.7211
Epoch 22/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.8967 - acc: 0.6944 - val_loss: 0.7139 - val_acc: 0.7592
Epoch 23/50
2500/2500 [==============================] - 257s 103ms/step - loss: 0.8825 - acc: 0.6967 - val_loss: 0.8611 - val_acc: 0.7066
Epoch 24/50
2500/2500 [==============================] - 260s 104ms/step - loss: 0.8819 - acc: 0.6967 - val_loss: 0.7436 - val_acc: 0.7447
Epoch 25/50
2500/2500 [==============================] - 270s 108ms/step - loss: 0.8780 - acc: 0.6995 - val_loss: 0.8129 - val_acc: 0.7310
Epoch 26/50
2500/2500 [==============================] - 279s 112ms/step - loss: 0.8756 - acc: 0.7010 - val_loss: 0.7890 - val_acc: 0.7276
Epoch 27/50
2500/2500 [==============================] - 283s 113ms/step - loss: 0.8680 - acc: 0.7027 - val_loss: 0.8185 - val_acc: 0.7307
Epoch 28/50
2500/2500 [==============================] - 287s 115ms/step - loss: 0.8651 - acc: 0.7043 - val_loss: 0.7457 - val_acc: 0.7460
Epoch 29/50
2500/2500 [==============================] - 286s 114ms/step - loss: 0.8531 - acc: 0.7065 - val_loss: 1.1669 - val_acc: 0.6483
Epoch 30/50
2500/2500 [==============================] - 290s 116ms/step - loss: 0.8521 - acc: 0.7085 - val_loss: 0.7221 - val_acc: 0.7565
Epoch 31/50
2500/2500 [==============================] - 289s 116ms/step - loss: 0.8518 - acc: 0.7072 - val_loss: 0.7308 - val_acc: 0.7549
Epoch 32/50
2500/2500 [==============================] - 291s 116ms/step - loss: 0.8465 - acc: 0.7119 - val_loss: 0.8550 - val_acc: 0.7182
Epoch 33/50
2500/2500 [==============================] - 302s 121ms/step - loss: 0.8406 - acc: 0.7121 - val_loss: 1.0259 - val_acc: 0.6770
Epoch 34/50
2500/2500 [==============================] - 286s 115ms/step - loss: 0.8424 - acc: 0.7120 - val_loss: 0.6924 - val_acc: 0.7646
Epoch 35/50
2500/2500 [==============================] - 273s 109ms/step - loss: 0.8337 - acc: 0.7143 - val_loss: 0.8744 - val_acc: 0.7220
Epoch 36/50
2500/2500 [==============================] - 285s 114ms/step - loss: 0.8332 - acc: 0.7144 - val_loss: 1.0132 - val_acc: 0.6753
Epoch 37/50
2500/2500 [==============================] - 275s 110ms/step - loss: 0.8382 - acc: 0.7122 - val_loss: 0.7873 - val_acc: 0.7366
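To put numbers on "stays flat" and "fluctuates", the logged values can be summarized with a few lines of plain Python. This is only a sketch over the 21 epochs shown above; the variable names are chosen for this illustration.

```python
# (epoch, acc, val_acc) copied from the training log for epochs 17-37.
log = [
    (17, 0.6840, 0.7111), (18, 0.6817, 0.6816), (19, 0.6896, 0.6786),
    (20, 0.6881, 0.7172), (21, 0.6874, 0.7211), (22, 0.6944, 0.7592),
    (23, 0.6967, 0.7066), (24, 0.6967, 0.7447), (25, 0.6995, 0.7310),
    (26, 0.7010, 0.7276), (27, 0.7027, 0.7307), (28, 0.7043, 0.7460),
    (29, 0.7065, 0.6483), (30, 0.7085, 0.7565), (31, 0.7072, 0.7549),
    (32, 0.7119, 0.7182), (33, 0.7121, 0.6770), (34, 0.7120, 0.7646),
    (35, 0.7143, 0.7220), (36, 0.7144, 0.6753), (37, 0.7122, 0.7366),
]

train_acc = [acc for _, acc, _ in log]
val_acc = [val for _, _, val in log]

# How much training accuracy moved between the first and last logged epoch.
train_gain = train_acc[-1] - train_acc[0]
# Spread between the best and worst validation accuracy in the same window.
val_spread = max(val_acc) - min(val_acc)
# Number of epochs in which validation accuracy exceeded training accuracy.
val_above_train = sum(1 for _, acc, val in log if val > acc)

print(f"train gain over {len(log)} epochs: {train_gain:.4f}")
print(f"val_acc spread: {val_spread:.4f}")
print(f"epochs with val_acc > acc: {val_above_train} of {len(log)}")
```

Validation accuracy is above training accuracy in most of these epochs. One possible explanation is that the augmentation is applied to the training images only, but the log by itself does not confirm that.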
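I am a beginner in deep learning, so please forgive me if I have made any silly mistakes. Please guide me on how to proceed.

Comments:

Lower your learning rate. It is too high.

Your validation accuracy and loss are oscillating, which may be caused by a high momentum value in the optimizer. I suggest you also try lowering the learning rate or increasing the decay. It might also help to change the number of filters in the convolutional layers: start with 64 filters in the first layer and use 128 filters in the other two. You can read about something similar; it may be helpful.

@Eric Thanks for the suggestion. I will try it. Also, does the order of the filter counts in the convolutional layers, i.e. (128, 128, 64), (128, 64, 128) or (64, 128, 128), significantly affect the result?

I think so, because each layer learns a specific kind of feature (contours, regions, ...). I only say that because (64, 128, 128) is the usual configuration, though. The best way to find out is to compare the results of these three configurations.

OK, thanks. I will try that too, @Eric.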
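The comments above suggest lowering the learning rate or increasing the decay. As a rough sketch of what the decay argument does, the snippet below applies the time-based rule used by the legacy Keras optimizers, lr / (1 + decay * iterations), to the disabled SGD settings from the question (lr=0.01, decay=1e-6) over the planned 50 epochs of 2500 steps. The function name decayed_lr and the larger decay values are illustrative assumptions, not recommendations, and the run shown in the log used 'nadam' rather than this SGD optimizer.

```python
def decayed_lr(initial_lr, decay, iteration):
    """Time-based decay: the learning rate after `iteration` weight updates."""
    return initial_lr / (1.0 + decay * iteration)

steps_per_epoch = 2500   # 40,000 training images / batch size 16
epochs = 50
total_updates = steps_per_epoch * epochs

# Compare the decay from the question with two larger, purely illustrative values.
final_lrs = {}
for decay in (1e-6, 1e-4, 1e-3):
    final_lrs[decay] = decayed_lr(0.01, decay, total_updates)
    print(f"decay={decay:g}: learning rate after {total_updates} updates = {final_lrs[decay]:.6f}")
```

With decay=1e-6 the learning rate only drops from 0.01 to about 0.0089 by the end of training, so under this rule the original setting leaves the step size almost unchanged.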