Python: How can I fix a constant validation accuracy in machine learning?
I am trying to use a pre-trained InceptionV3 model to classify DICOM images with balanced classes.
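import os
import cv2
import keras
import numpy as np
import pydicom
from PIL.Image import fromarray  # assumed source of fromarray
from sklearn.model_selection import train_test_split

def convertDCM(PathDCM):
    data = []
    for dirName, subdir, files in os.walk(PathDCM):
        for filename in sorted(files):
            # join with dirName so files in subdirectories resolve correctly
            ds = pydicom.dcmread(os.path.join(dirName, filename))
            im = fromarray(ds.pixel_array)
            im = keras.preprocessing.image.img_to_array(im)
            im = cv2.resize(im, (299, 299))
            data.append(im)
    return data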
PathDCM = '/home/Desktop/FULL_BALANCED_COLOURED/'
data = convertDCM(PathDCM)
# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
# `labels` is not defined in the posted snippet
labels = np.array(labels, dtype="int")
# split the data into training and testing sets
# test_size is the fraction of the data held out for testing
(trainX, testX, trainY, testY) = train_test_split(
    data, labels,
    test_size=0.2,
    random_state=42)
img_width, img_height = 299, 299  # InceptionV3 input size
train_samples = 300
validation_samples = 50
epochs = 25
batch_size = 15
base_model = keras.applications.InceptionV3(
    weights='imagenet',
    include_top=False,
    input_shape=(img_width, img_height, 3))
model_top = keras.models.Sequential()
model_top.add(keras.layers.GlobalAveragePooling2D(input_shape=base_model.output_shape[1:], data_format=None))
model_top.add(keras.layers.Dense(300, activation='relu'))
model_top.add(keras.layers.Dropout(0.5))
model_top.add(keras.layers.Dense(1, activation='sigmoid'))
model = keras.models.Model(inputs=base_model.input, outputs=model_top(base_model.output))
# Compiling the model
model.compile(
    optimizer=keras.optimizers.Adam(lr=0.0001),
    loss='binary_crossentropy',
    metrics=['accuracy'])
# Image processing and augmentation
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    zoom_range=0.1,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
val_datagen = keras.preprocessing.image.ImageDataGenerator()
train_generator = train_datagen.flow(
    trainX,
    trainY,
    batch_size=batch_size,
    shuffle=True)
validation_generator = train_datagen.flow(
    testX,
    testY,
    batch_size=batch_size,
    shuffle=True)
When I train the model, I always get a constant validation accuracy of 0.3889, while the validation loss fluctuates.
# Training the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)
Epoch 1/25
20/20 [==============================] - 195s 49s/step - loss: 0.7677 - acc: 0.4020 - val_loss: 0.7784 - val_acc: 0.3889
Epoch 2/25
20/20 [==============================] - 187s 47s/step - loss: 0.7016 - acc: 0.4848 - val_loss: 0.7531 - val_acc: 0.3889
Epoch 3/25
20/20 [==============================] - 191s 48s/step - loss: 0.6566 - acc: 0.6304 - val_loss: 0.7492 - val_acc: 0.3889
Epoch 4/25
20/20 [==============================] - 175s 44s/step - loss: 0.6533 - acc: 0.5529 - val_loss: 0.7575 - val_acc: 0.3889
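As a side note on the step counts in the training call, the following sketch redoes the integer division using only the values given in the question. It suggests that each validation pass draws 3 batches, so at most 45 images, rather than the 50 that validation_samples implies. The variable images_per_validation_pass is a name introduced here for illustration.

```python
# Values copied from the question.
train_samples = 300
validation_samples = 50
batch_size = 15

# Integer division, exactly as written in the fit_generator call.
steps_per_epoch = train_samples // batch_size
validation_steps = validation_samples // batch_size

# Upper bound on images seen per validation pass (hypothetical helper name).
images_per_validation_pass = validation_steps * batch_size

print(steps_per_epoch)             # 20, matching the "20/20" in the log
print(validation_steps)            # 3
print(images_per_validation_pass)  # 45
```

With so few validation images, a single misclassified image shifts the reported accuracy by more than two percentage points.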
predictions= model.predict(testX)
print(predictions)
The model's predict call also returns an array with only one prediction per image:
[[0.457804 ]
[0.45051473]
[0.48343503]
[0.49180537]...
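Why does the model predict only one of the two classes? Is this related to the constant validation accuracy, or possibly to overfitting?
If there are two classes, every image belongs to one or the other, so the probability of one class is enough to determine everything, because the probabilities for each image must sum to 1. If the probability of one class is p, the probability of the other is 1 - p.
If you want to be able to classify images that belong to neither of the two classes, you should create a third class.
Also, this line: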
model_top.add(keras.layers.Dense(1, activation = 'sigmoid'))
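means that the output is a vector of shape (nb_samples, 1), which is the same shape as the training labels.
OK, that makes sense for the predictions, but do you know the reason for the constant validation accuracy?
There are many possible reasons for a constant accuracy. The good news is that only the validation accuracy is constant. This means that even though the model is learning on the training set, that learning does not change how the test set images are classified. The most common cause is that both datasets are too small and therefore differ too much from each other. Run all 25 epochs and see whether anything changes. If not, try giving the network more data.
@student17 Please accept the answer to close the topic.
Your training and validation sets are too small for effective training and stable validation, respectively...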
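To make the point about the single sigmoid output concrete, here is a minimal sketch of how such probabilities turn into class labels, assuming the usual 0.5 threshold. The helpers to_class_labels and constant_prediction_accuracy are hypothetical names, and example_labels is invented for illustration. If every predicted probability stays below 0.5, every image is assigned class 0, and the accuracy then equals the share of class-0 images however the probabilities move. That would be one way for val_acc to stay fixed while val_loss changes.

```python
import numpy as np

def to_class_labels(probabilities, threshold=0.5):
    """Turn sigmoid outputs of shape (n, 1) into 0/1 labels of shape (n,)."""
    probabilities = np.asarray(probabilities, dtype=float).reshape(-1)
    return (probabilities >= threshold).astype(int)

def constant_prediction_accuracy(labels):
    """Accuracy of always predicting class 0, and of always predicting class 1."""
    labels = np.asarray(labels).reshape(-1)
    share_of_ones = float(np.mean(labels == 1))
    return {0: 1.0 - share_of_ones, 1: share_of_ones}

# The four predictions printed in the question.
predictions = np.array([[0.457804], [0.45051473], [0.48343503], [0.49180537]])
predicted = to_class_labels(predictions)
print(predicted)  # [0 0 0 0]

# Hypothetical labels, for illustration only (not from the question).
example_labels = np.array([0, 1, 1, 0])
accuracy = float(np.mean(predicted == example_labels))
baseline = constant_prediction_accuracy(example_labels)
print(accuracy, baseline)
```

Comparing the reported val_acc with the values from constant_prediction_accuracy(testY) is a quick way to check whether the model has collapsed to a single class on the validation data.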