Why is the YOLO training loss not decreasing significantly and the mean IoU not increasing?
I am trying to implement YOLO from the paper (the paper does not say that it is v1, but it is the first paper, so I assume it is v1). I am using Keras and TensorFlow 1.x on Google Colab.
TL;DR results:
Starting epochs:
Iteration, 0
Train on 1800 samples, validate on 450 samples
Epoch 1/32
1800/1800 [==============================] - 13s 7ms/step - loss: 541.8767 - mean_iou_metric: 0.0040 - val_loss: 361.9846 - val_mean_iou_metric: 0.0043
Epoch 2/32
1800/1800 [==============================] - 11s 6ms/step - loss: 378.6184 - mean_iou_metric: 0.0042 - val_loss: 330.4124 - val_mean_iou_metric: 0.0043
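Ending epochs (320 epochs in total, 32 per loop):
Problem: Even after this many epochs the loss has not decreased by much (it has decreased, which is fine), but the mean IoU is not increasing, and that worries me. Why is this happening? At this stage I am unable to debug why the IoU does not increase while the loss decreases. Are loss values like these normal? They do not look normal to me, so any suggestions would be appreciated. I suspect I am doing something wrong in the loss function implementation.
Dataset: The dataset has size 2500*256*256*3 and consists of images with a white background and colored shapes of 3 types: rectangle, triangle and circle. An image can contain no shapes or at most 3 shapes. The shapes can be of the same type or of different types. The dataset can be generated with a Python file.
Parameter specification: Following the paper, I set S (for an SxS grid), B (number of bounding boxes per grid cell) and C (number of classes) as follows: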
N=len(labels)
print("No of images, ",N)
# No of bounding boxes per grid, B
B=1
# No of grids,S*S
S=16
# No. of classes, C
C=3 #3 for 3 types of shapes
# Output=SxSx(5B+C)
I_S=256 # Image dimension I_SxI_S
classes={'circle':0,'triangle':1,'rectangle':2}
lenClasses=len(classes)
#print(lenClasses)
norm_const=I_S/S
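As a quick sanity check on these settings, the sketch below computes the per-cell vector length and the total number of network outputs implied by S, B and C. The helper name `yolo_output_shape` is made up for this illustration and is not part of the original code.

```python
def yolo_output_shape(S, B, C):
    """Return (grid shape, flat size) of a YOLO-style output tensor."""
    # Per cell: B boxes x (confidence, cx, cy, h, w) plus C class scores
    depth = 5 * B + C
    return (S, S, depth), S * S * depth


grid_shape, flat_size = yolo_output_shape(S=16, B=1, C=3)
cell_size_px = 256 / 16  # same value as norm_const: pixels per grid cell
print(grid_shape, flat_size, cell_size_px)
```

With S=16, B=1 and C=3 each cell carries 8 values, which matches the 8-element ground-truth vector described in the post.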
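Note that I have defined a constant called norm_const, which is used to normalize the ground-truth center coordinates, heights and widths, whose values lie in the range 0-255.
How I normalize the images, center coordinates, heights and widths: The ground truth is a JSON structure holding the x1, x2, y1, y2 coordinates of the bounding box of every shape in an image. I compute the center, height and width from these and normalize them. The final vector for each grid cell is [1,cx,cy,h,w,0,0,1], where the first value is the confidence score, the next four are the box coordinates and the last three are the class scores. If a grid cell contains no center, its vector is automatically [0,0,0,0,0,0,0,0] (because the array is created with numpy zeros).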
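The encoding described above can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the function name `encode_boxes`, the tuple format of the input, and the choice of row = y, column = x for the grid axes are hypothetical and may differ from the original preprocessing code.

```python
import numpy as np

S, C, I_S = 16, 3, 256
CELL = I_S / S  # pixels per grid cell (norm_const in the post)


def encode_boxes(boxes):
    """boxes: list of (x1, x2, y1, y2, class_id) in pixels -> S x S x (5 + C) target."""
    target = np.zeros((S, S, 5 + C), dtype=np.float32)  # empty cells stay all-zero
    for x1, x2, y1, y2, class_id in boxes:
        cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        w, h = x2 - x1, y2 - y1
        # Cell that contains the box centre (clamped to the last cell)
        col = min(int(cx // CELL), S - 1)
        row = min(int(cy // CELL), S - 1)
        # Assumption: first grid axis is the row (y), second is the column (x)
        target[row, col, 0] = 1.0              # confidence
        target[row, col, 1] = cx / CELL - col  # centre offset inside the cell
        target[row, col, 2] = cy / CELL - row
        target[row, col, 3] = h / I_S          # size relative to the whole image
        target[row, col, 4] = w / I_S
        target[row, col, 5 + class_id] = 1.0   # one-hot class score
    return target


y = encode_boxes([(100, 140, 40, 60, 2)])  # one rectangle (class id 2)
print(y[3, 7])
```

The centre offsets are relative to the cell and the sizes are relative to the image, which is the inverse of the "restore" step used in the IoU code (norm_const*(index+offset) and size*I_S).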
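Loss function implementation:
IoU (Intersection over Union) implementation for the confidence score: inspired by an existing implementation, the IoU is computed as follows: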
import numpy as np
import tensorflow as tf
from keras import backend as K

# Grid-index tensors that are added to the normalized (per-cell) centers
indices=np.reshape(np.arange(S),[1,S]) # 1 x S, values 0..S-1
indices_tensor_Y=tf.constant(indices,dtype=tf.float32) # 1 x S
indices_tensor_Y=tf.repeat(indices_tensor_Y,repeats=[S],axis=0) # S x S, every row is 0..S-1
indices_tensor_X=tf.transpose(indices_tensor_Y) # S x S
indices_tensor_Y=tf.reshape(indices_tensor_Y,[1,S,S]) # 1 x S x S, broadcasts over the batch
indices_tensor_X=tf.reshape(indices_tensor_X,[1,S,S]) # 1 x S x S, broadcasts over the batch

# IoU calculation between ground-truth and predicted bounding boxes
def return_iou_tensor(box_true,box_pred,i):
    '''
    box_true=batch x S x S x 8, laid out as [conf,cx,cy,h,w,class scores]
    box_pred=batch x S x S x (5B+C)
    i=index of the predicted box (0..B-1)
    '''
    # Restored gt (pixel units). Indices 1..4 follow the [conf,cx,cy,h,w,...]
    # layout described above; the earlier version read indices 2..5.
    cx_restored_gt_tensor=norm_const*(indices_tensor_X+box_true[:,:,:,1]) # 1 x S x S + batch x S x S = batch x S x S
    cy_restored_gt_tensor=norm_const*(indices_tensor_Y+box_true[:,:,:,2]) # batch x S x S
    h_restored_gt_tensor=box_true[:,:,:,3]*I_S # batch x S x S
    w_restored_gt_tensor=box_true[:,:,:,4]*I_S # batch x S x S
    # Restored predicted (pixel units), clipped at 0 to remove negative values
    cx_restored_pred_tensor=norm_const*(indices_tensor_X+box_pred[:,:,:,B+4*i]) # batch x S x S
    cx_restored_pred_tensor=tf.math.maximum(cx_restored_pred_tensor,0)
    cy_restored_pred_tensor=norm_const*(indices_tensor_Y+box_pred[:,:,:,B+1+4*i]) # batch x S x S
    cy_restored_pred_tensor=tf.math.maximum(cy_restored_pred_tensor,0)
    h_restored_pred_tensor=box_pred[:,:,:,B+2+4*i]*I_S # batch x S x S
    h_restored_pred_tensor=tf.math.maximum(h_restored_pred_tensor,0)
    w_restored_pred_tensor=box_pred[:,:,:,B+3+4*i]*I_S # batch x S x S
    w_restored_pred_tensor=tf.math.maximum(w_restored_pred_tensor,0)
    # Corners of the intersection box, all batch x S x S
    x_min_tensor=tf.math.maximum(tf.math.maximum(cx_restored_gt_tensor-w_restored_gt_tensor/2,0),tf.math.maximum(cx_restored_pred_tensor-w_restored_pred_tensor/2,0))
    y_min_tensor=tf.math.maximum(tf.math.maximum(cy_restored_gt_tensor-h_restored_gt_tensor/2,0),tf.math.maximum(cy_restored_pred_tensor-h_restored_pred_tensor/2,0))
    x_max_tensor=tf.math.minimum(cx_restored_gt_tensor+w_restored_gt_tensor/2,cx_restored_pred_tensor+w_restored_pred_tensor/2)
    y_max_tensor=tf.math.minimum(cy_restored_gt_tensor+h_restored_gt_tensor/2,cy_restored_pred_tensor+h_restored_pred_tensor/2)
    w_intersection=tf.math.maximum(x_max_tensor-x_min_tensor,0)
    h_intersection=tf.math.maximum(y_max_tensor-y_min_tensor,0)
    intersection_tensor=w_intersection*h_intersection # batch x S x S
    # Union = area(gt) + area(pred) - intersection (the intersection must not be counted twice)
    union_tensor=(w_restored_gt_tensor*h_restored_gt_tensor)+(w_restored_pred_tensor*h_restored_pred_tensor)-intersection_tensor # batch x S x S
    smooth=1 # avoids division by 0
    return (intersection_tensor+smooth)/(union_tensor+smooth) # batch x S x S
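For reference, here is a scalar, framework-free version of the same computation that can be used to check the tensor code by hand. It uses the standard definition in which the union is the sum of the two areas minus the intersection, and it omits the smoothing term. The function name `iou_center_format` is hypothetical.

```python
def iou_center_format(box_a, box_b):
    """IoU of two boxes given as (cx, cy, h, w) in pixels."""
    def corners(box):
        cx, cy, h, w = box
        return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2

    ax1, ay1, ax2, ay2 = corners(box_a)
    bx1, by1, bx2, by2 = corners(box_b)
    # Overlap along each axis, clipped at zero for disjoint boxes
    inter_w = max(min(ax2, bx2) - max(ax1, bx1), 0.0)
    inter_h = max(min(ay2, by2) - max(ay1, by1), 0.0)
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


same = iou_center_format((50, 50, 20, 40), (50, 50, 20, 40))     # identical boxes
shifted = iou_center_format((50, 50, 20, 40), (70, 50, 20, 40))  # shifted by half the width
apart = iou_center_format((50, 50, 20, 40), (200, 200, 20, 40))  # no overlap
print(same, shifted, apart)
```

Two identical boxes give 1.0, and shifting one box by half its width gives 400/1200, i.e. one third. Feeding the same boxes through the tensor version is a simple way to confirm that both agree.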
And the mean IoU metric that is monitored during training:
def mean_iou_metric(y_true,y_pred):
    mean_iou=0.0
    for i in range(0,B):
        # Mask the IoU with the ground-truth confidence so only object cells contribute
        iou_tensor=y_true[:,:,:,0]*return_iou_tensor(y_true,y_pred,i)
        mean_iou=mean_iou+K.mean(iou_tensor)
    return mean_iou/B
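One property of this metric is worth checking: K.mean averages over every grid cell (batch x S x S), so cells without an object contribute zeros to the average. The NumPy sketch below shows the resulting scale under two simplifying assumptions, one object cell per image and a perfect IoU of 1.0. The alternative normalisation by the number of object cells is only an illustration and is not part of the original code.

```python
import numpy as np

S = 16
batch = 4
conf_true = np.zeros((batch, S, S), dtype=np.float32)
conf_true[:, 5, 5] = 1.0  # assumption: exactly one object cell per image
iou = np.ones((batch, S, S), dtype=np.float32)  # assumption: perfect IoU in every cell

masked_iou = conf_true * iou
mean_over_all_cells = masked_iou.mean()                      # same reduction as K.mean
mean_over_object_cells = masked_iou.sum() / conf_true.sum()  # average over object cells only

print(mean_over_all_cells)
print(mean_over_object_cells)
```

Under these assumptions the average over all cells is 1/256 (about 0.0039) even though every box is perfect, so values of that order in the training log may reflect the normalisation rather than the quality of the predicted boxes.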
Model (based on Fast YOLO), followed by the latest training parameters:
from keras import backend as K
from keras.layers import Input,Conv2D,MaxPooling2D,Flatten,Dense,Reshape
from keras.models import Model
import tensorflow as tf

def custom_activation(x):
    # Leaky ReLU with slope 0.1 for negative inputs
    # The boolean result of the comparison has to be cast to float
    isPositive=K.cast(K.greater(x,0),K.floatx())
    # The output of this function is a tensor
    return (isPositive*x)+(1-isPositive)*0.1*x
############### BLOCK 1 ##############################
input_=Input(shape=(256,256,3),name='input')
#zeropad1=ZeroPadding2D(padding=(3,3))(input_) # PADDING MAKES 448->3+448+3, it is required to bring output to 112
convLayer1=Conv2D(64,(7,7),strides=(2,2),padding='valid',activation=custom_activation,name='conv_layer1')(input_)
maxpoolLayer1=MaxPooling2D(pool_size=(2,2),name='max_pool_layer1')(convLayer1)
#zeropad2=ZeroPadding2D(padding=(1,1))(maxpoolLayer1)
########################################################
############### BLOCK 2 ##############################
convLayer2=Conv2D(192,(3,3),padding='valid',activation=custom_activation,name='conv_layer2')(maxpoolLayer1)
maxpoolLayer2=MaxPooling2D(pool_size=(2,2),name='max_pool_layer2')(convLayer2)
#zeropad3=ZeroPadding2D(padding=(2,2))(maxpoolLayer2)
########################################################
############### BLOCK 3 ##############################
convLayer3=Conv2D(128,(1,1),padding='valid',activation=custom_activation,name='conv_layer3')(maxpoolLayer2)
convLayer4=Conv2D(256,(3,3),padding='valid',activation=custom_activation,name='conv_layer4')(convLayer3)
#convLayer5=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer5')(convLayer4)
#convLayer6=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer6')(convLayer5)
maxpoolLayer3=MaxPooling2D(pool_size=(2,2),name='max_pool_layer3')(convLayer4)
#zeropad4=ZeroPadding2D(padding=(5,5))(maxpoolLayer3)
########################################################
############### BLOCK 4 ##############################
convLayer7=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer7')(maxpoolLayer3)
convLayer8=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer8')(convLayer7)
#convLayer9=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer9')(convLayer8)
#convLayer10=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer10')(convLayer9)
#convLayer11=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer11')(convLayer10)
#convLayer12=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer12')(convLayer11)
#convLayer13=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer13')(convLayer12)
#convLayer14=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer14')(convLayer13)
#convLayer15=Conv2D(512,(1,1),padding='valid',activation=custom_activation,name='conv_layer15')(convLayer14)
#convLayer16=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer16')(convLayer15)
maxpoolLayer4=MaxPooling2D(pool_size=(2,2),name='max_pool_layer4')(convLayer8)
#zeropad5=ZeroPadding2D(padding=(4,4))(maxpoolLayer4)
###########################################################
############### BLOCK 5 ##################################
convLayer17=Conv2D(512,(1,1),padding='valid',activation=custom_activation,name='conv_layer17')(maxpoolLayer4)
convLayer18=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer18')(convLayer17)
#convLayer19=Conv2D(512,(1,1),padding='valid',activation=custom_activation,name='conv_layer19')(convLayer18)
#convLayer20=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer20')(convLayer19)
#convLayer21=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer21')(convLayer20)
#convLayer22=Conv2D(1024,(3,3),strides=(2,2),padding='valid',activation=custom_activation,name='conv_layer22')(convLayer21)
#zeropad6=ZeroPadding2D(padding=(2,2))(convLayer18)
#############################################################
################ BLOCK 6 ####################################
convLayer23=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer23')(convLayer18)
#convLayer24=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer24')(convLayer23)
flattenedLayer1=Flatten()(convLayer23) # Flatten just converts 3d matrix to 1d so that it can be connected to a next Dense Layer
###############################################################
################ BLOCK 7 #########################################
denseLayer1=Dense(units=4096,activation=custom_activation)(flattenedLayer1)
##################################################################
################ BLOCK 8 #########################################
denseLayer2=Dense(units=S*S*(5*B+C),activation='linear')(denseLayer1)
output_=Reshape((S,S,(5*B+C)))(denseLayer2) # Reshapes the 1D to 3D
##################################################################
fast_model=Model(inputs=input_,outputs=output_)
fast_model.summary()
from keras.utils import plot_model
plot_model(fast_model,to_file='unet.png',show_shapes=True)
model=fast_model # SELECT MODEL
model.save_weights('weights.hdf5')
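Since every convolution above uses padding='valid', the feature maps shrink at each layer. The sketch below traces the spatial size through the active (uncommented) layers using the usual output-size formulas; the helper names `conv_valid` and `pool` are made up for this illustration.

```python
def conv_valid(size, kernel, stride=1):
    """Output size of a 'valid' convolution along one spatial axis."""
    return (size - kernel) // stride + 1


def pool(size, pool_size=2):
    """Output size of 2x2 max pooling with default strides and 'valid' padding."""
    return size // pool_size


size = 256
size = pool(conv_valid(size, 7, stride=2))       # block 1: 256 -> 125 -> 62
size = pool(conv_valid(size, 3))                 # block 2: 62 -> 60 -> 30
size = pool(conv_valid(conv_valid(size, 1), 3))  # block 3: 30 -> 30 -> 28 -> 14
size = pool(conv_valid(conv_valid(size, 1), 3))  # block 4: 14 -> 14 -> 12 -> 6
size = conv_valid(conv_valid(size, 1), 3)        # block 5: 6 -> 6 -> 4
size = conv_valid(size, 3)                       # block 6: 4 -> 2
flat_features = size * size * 1024               # input width of the first Dense layer
print(size, flat_features)
```

The last convolution therefore produces a 2x2x1024 map, which Flatten turns into 4096 features before the two Dense layers. The numbers can be compared against the output of fast_model.summary().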
from keras import callbacks
from keras.callbacks import ReduceLROnPlateau
from keras.optimizers import Adam

# Training parameters; yolo_loss_trial is the custom loss function (not shown here)
model.compile(optimizer=Adam(learning_rate=1e-5),loss=yolo_loss_trial,metrics=[mean_iou_metric])
#model.compile(optimizer=SGD(learning_rate=1e-5),loss=yolo_loss_trial,metrics=[mean_iou_metric])
model.load_weights('weights.hdf5')
checkpointer = callbacks.ModelCheckpoint(filepath = 'weights.hdf5',save_best_only=True)
training_log = callbacks.TensorBoard(log_dir='./Model_logs')
# patience is the number of epochs without improvement after which the lr is reduced: newlr=lr*factor
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.2,patience=3, min_lr=1e-5,mode='auto')
# 10 loops of 32 epochs each = 320 epochs in total
for i in range(0,10):
    print("Iteration, ",i)
    history=model.fit(X_train,Y_train,validation_data=(X_val,Y_val),batch_size=16,epochs=32,callbacks=[training_log,checkpointer,reduce_lr],shuffle=True)
# SAVE MODEL TO DRIVE
!cp '/content/weights.hdf5' 'gdrive/My Drive/Colab Notebooks/Colab Datasets/Breast_Cancer_HNS/Images'
# CONFIRM EXECUTION TIMESTAMP
from datetime import datetime
import pytz
tz = pytz.timezone('Asia/Calcutta')
now_local = datetime.now(tz)
dt_string = now_local.strftime("%d/%m/%Y %H:%M:%S")
print(dt_string)
#from google.colab import output
#output.eval_js('new Audio("https://ssl.gstatic.com/dictionary/static/sounds/20180430/complete--_us_1.mp3").play()')
# SAVE WHOLE MODEL TO LOCAL/COLAB DRIVE
model.save("FastYolo") #Saves weights also according to official docs
# SAVE MODEL TO GOOGLE DRIVE
!cp '/content/FastYolo' '/content/gdrive/My Drive/Colab Notebooks/Colab Datasets/Shape_Detection_YOLO/None2500NoOverlap'
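One detail of the training setup above: the scheduler is created with min_lr=1e-5, which is the same value as the optimizer's initial learning rate. Keras' ReduceLROnPlateau does not go below min_lr, so the plain-Python sketch below mimics that update rule to show the effect. `next_lr` is a hypothetical helper written for this illustration, not a Keras function.

```python
def next_lr(current_lr, factor, min_lr):
    """Learning rate after one plateau: scaled by factor, never below min_lr."""
    return max(current_lr * factor, min_lr)


lr_from_min = next_lr(1e-5, factor=0.2, min_lr=1e-5)     # starting at min_lr
lr_from_higher = next_lr(1e-3, factor=0.2, min_lr=1e-5)  # starting above min_lr
print(lr_from_min, lr_from_higher)
```

Starting at 1e-5 the rate stays at 1e-5, so with these settings the callbacks list would behave the same without reduce_lr. A lower min_lr or a higher starting rate would be needed for the schedule to have any effect.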