Why is the YOLO training loss not decreasing significantly and the mean IoU not increasing?



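I am trying to implement YOLO from the paper (the paper does not say it is v1, but it is the first paper, so I assume it is v1). I am implementing it on Google Colab using Keras and TensorFlow 1.x.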

TL;DR results:

Starting epochs:

Iteration,  0
Train on 1800 samples, validate on 450 samples
Epoch 1/32
1800/1800 [==============================] - 13s 7ms/step - loss: 541.8767 - mean_iou_metric: 0.0040 - val_loss: 361.9846 - val_mean_iou_metric: 0.0043
Epoch 2/32
1800/1800 [==============================] - 11s 6ms/step - loss: 378.6184 - mean_iou_metric: 0.0042 - val_loss: 330.4124 - val_mean_iou_metric: 0.0043

Ending epochs (320 epochs in total, 32 per loop):

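Problem: even after this many epochs the loss has not decreased much (it did decrease, which is fine), but the mean IoU is not increasing, and that worries me. Why is this happening? At this stage I cannot work out why the IoU does not increase while the loss decreases. Is a loss value like this natural? It does not look natural to me, so any suggestions would be appreciated. I suspect I did something wrong in the loss function implementation.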

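Dataset: the dataset I am using has size 2500*256*256*3 and consists of white backgrounds with 3 types of colored shapes: rectangles, triangles and circles. An image can contain no shapes or at most 3 shapes. The shapes in an image can be of the same type or of different types. The dataset can be generated with a Python script. Example image: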

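Parameter specification: following the paper, I set S (so an SxS grid), B (the number of bounding boxes per grid cell) and C (the number of classes) as follows: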

N=len(labels)
print("No of images, ",N)
# No of bounding boxes per grid, B
B=1
# No of grids,S*S
S=16
# No. of classes, C
C=3 #3 for 3 types of shapes
# Output=SxSx(5B+C)
I_S=256 # Image dimension I_SxI_S
classes={'circle':0,'triangle':1,'rectangle':2}
lenClasses=len(classes)
#print(lenClasses)
norm_const=I_S/S


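Note that I have defined a constant called norm_const, which is used to normalize the ground-truth center coordinates, heights and widths, which range from 0 to 255.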

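How I normalize the image, center coordinates, height and width: the ground truth is a JSON structure holding the x1, x2, y1, y2 coordinates of the bounding box of each shape in an image. I compute the center, height and width values and normalize them. The final vector for each grid cell is [1, cx, cy, h, w, 0, 0, 1], where the first value is the confidence score, the last three values are the class scores and the rest are the coordinates. If a grid cell contains no center, its vector is automatically [0, 0, 0, 0, 0, 0, 0, 0] (because the array is created with numpy zeros).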
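
To make the encoding above concrete, here is a minimal sketch of how one bounding box could be turned into its grid-cell vector. The function names `encode_box` and `decode_center` are hypothetical, and indexing the grid as [x cell][y cell] is an assumption made to match the IoU code in this post, where the x index varies along the first grid axis. The actual preprocessing code is not shown in this post, so treat this as an illustration only.

```python
S = 16                 # grid cells per side, as in the post
I_S = 256              # image side length in pixels, as in the post
C = 3                  # number of classes
NORM_CONST = I_S / S   # pixels per grid cell (norm_const in the post)


def encode_box(x1, x2, y1, y2, class_idx):
    """Return (cell_x, cell_y, vector) for one box given in pixel coordinates."""
    cx = (x1 + x2) / 2.0
    cy = (y1 + y2) / 2.0
    # Grid cell that contains the box center (clamped to the last cell).
    cell_x = min(int(cx // NORM_CONST), S - 1)
    cell_y = min(int(cy // NORM_CONST), S - 1)
    # Center offset inside the cell, in [0, 1).
    off_x = cx / NORM_CONST - cell_x
    off_y = cy / NORM_CONST - cell_y
    # Height and width as a fraction of the image size.
    h = (y2 - y1) / I_S
    w = (x2 - x1) / I_S
    one_hot = [1.0 if k == class_idx else 0.0 for k in range(C)]
    return cell_x, cell_y, [1.0, off_x, off_y, h, w] + one_hot


def decode_center(cell_x, cell_y, vector):
    """Invert the center encoding: norm_const * (cell index + offset)."""
    return NORM_CONST * (cell_x + vector[1]), NORM_CONST * (cell_y + vector[2])


# Hypothetical rectangle (class index 2) spanning x 40..72 and y 100..140.
cell_x, cell_y, vec = encode_box(x1=40, x2=72, y1=100, y2=140, class_idx=2)
print(cell_x, cell_y, vec)
print(decode_center(cell_x, cell_y, vec))
```

Decoding with norm_const * (index + offset) recovers the pixel center, which is the same restoration step used for the ground truth in `return_iou_tensor`.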

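Loss function implementation:

IoU (intersection over union) implementation for the confidence score. The IoU implementation, inspired by an existing one, is as follows: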

import numpy as np
import tensorflow as tf
from keras import backend as K

# Creating INDICES tensor to add to normalized centers
indices=np.reshape(np.arange(S),[1,S]) # consists of 0 to S-1, i.e., indices.
indices_tensor_Y=tf.constant(indices,dtype=float) # 1x S
indices_tensor_Y=tf.repeat(indices_tensor_Y,repeats=[S],axis=0) # S x S, 0123S;0123S;0123S S rows
indices_tensor_X=tf.transpose(indices_tensor_Y) # S x S
indices_tensor_Y=tf.reshape(indices_tensor_Y,[1,S,S]) # 1 x S x S
indices_tensor_X=tf.reshape(indices_tensor_X,[1,S,S]) # 1 x S x S
#indices_tensor=tf.repeat(indices_tensor,repeats=[batch_tensor],axis=0) # batch x S x S
# repeat() will repeat axis-0 (SxS), batch_tensor number of times along the channel

# IOU Calculation between two bounding boxes
def return_iou_tensor(box_true,box_pred,i):
  '''
  box_true=batch x S x S x 8
  box_pred=batch x S x S x (5B+C)
  '''
  
  # Restored gt
  cx_restored_gt_tensor=norm_const*(indices_tensor_X+box_true[:,:,:,2]) # 1 x S x S + batch x S x S = batch x S x S
  cy_restored_gt_tensor=norm_const*(indices_tensor_Y+box_true[:,:,:,3]) # 1 x S x S + batch x S x S = batch x S x S
  h_restored_gt_tensor=box_true[:,:,:,4]*I_S # batch x S x S
  w_restored_gt_tensor=box_true[:,:,:,5]*I_S # batch x S x S

  # Restored predicted
  cx_restored_pred_tensor=norm_const*(indices_tensor_X+box_pred[:,:,:,B+4*i]) # 1 x S x S + batch x S x S = batch x S x S
  cx_restored_pred_tensor=tf.math.maximum(cx_restored_pred_tensor,0)# To remove negative values
  cy_restored_pred_tensor=norm_const*(indices_tensor_Y+box_pred[:,:,:,B+1+4*i]) # 1 x S x S + batch x S x S = batch x S x S
  cy_restored_pred_tensor=tf.math.maximum(cy_restored_pred_tensor,0)# To remove negative values
  h_restored_pred_tensor=box_pred[:,:,:,B+2+4*i]*I_S # batch x S x S
  h_restored_pred_tensor=tf.math.maximum(h_restored_pred_tensor,0)# To remove negative values
  w_restored_pred_tensor=box_pred[:,:,:,B+3+4*i]*I_S # batch x S x S
  w_restored_pred_tensor=tf.math.maximum(w_restored_pred_tensor,0)# To remove negative values

  # min max of intersection box all, batch x S x S
  x_min_tensor=tf.math.maximum(tf.math.maximum(cx_restored_gt_tensor-w_restored_gt_tensor/2,0),tf.math.maximum(cx_restored_pred_tensor-w_restored_pred_tensor/2,0))
  y_min_tensor=tf.math.maximum(tf.math.maximum(cy_restored_gt_tensor-h_restored_gt_tensor/2,0),tf.math.maximum(cy_restored_pred_tensor-h_restored_pred_tensor/2,0))
  x_max_tensor=tf.math.minimum(cx_restored_gt_tensor+w_restored_gt_tensor/2,cx_restored_pred_tensor+w_restored_pred_tensor/2)
  y_max_tensor=tf.math.minimum(cy_restored_gt_tensor+h_restored_gt_tensor/2,cy_restored_pred_tensor+h_restored_pred_tensor/2)
  w_intersection=tf.math.maximum(x_max_tensor-x_min_tensor,0)
  h_intersection=tf.math.maximum(y_max_tensor-y_min_tensor,0)
  intersection_tensor=w_intersection*h_intersection # batch x S x S
  union_tensor=(w_restored_gt_tensor*h_restored_gt_tensor)+(w_restored_pred_tensor*h_restored_pred_tensor) # batch x S x S
  smooth=1 # We are using smooth because we dont want division by 0
  return (intersection_tensor+smooth)/(union_tensor+smooth) #batch x S x S
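
For reference, the textbook IoU of two boxes is intersection / (area_a + area_b - intersection). The sketch below is a plain-Python version for a single pair of boxes in (cx, cy, h, w) pixel format. It is not the code used in training, and `iou_center_format` is a hypothetical helper name. Comparing its output with `return_iou_tensor` on a few hand-made boxes may help check whether the denominator above (which adds the two areas without subtracting the intersection) and the `smooth` term behave as intended.

```python
def iou_center_format(box_a, box_b):
    """IoU of two boxes given as (cx, cy, h, w) in pixels."""
    cx_a, cy_a, h_a, w_a = box_a
    cx_b, cy_b, h_b, w_b = box_b

    # Corners of the intersection rectangle.
    x_min = max(cx_a - w_a / 2.0, cx_b - w_b / 2.0)
    y_min = max(cy_a - h_a / 2.0, cy_b - h_b / 2.0)
    x_max = min(cx_a + w_a / 2.0, cx_b + w_b / 2.0)
    y_max = min(cy_a + h_a / 2.0, cy_b + h_b / 2.0)

    # Clamp at zero so disjoint boxes give an empty intersection.
    intersection = max(x_max - x_min, 0.0) * max(y_max - y_min, 0.0)
    union = w_a * h_a + w_b * h_b - intersection
    return intersection / union if union > 0 else 0.0


same = iou_center_format((5, 5, 10, 10), (5, 5, 10, 10))      # identical boxes
shifted = iou_center_format((5, 5, 10, 10), (10, 5, 10, 10))  # half overlap
apart = iou_center_format((5, 5, 10, 10), (50, 50, 10, 10))   # no overlap
print(same, shifted, apart)
```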

And the mean IoU metric observed during training:

def mean_iou_metric(y_true,y_pred):
  mean_iou=0.0
  for i in range(0,B):
    iou_tensor=y_true[:,:,:,0]*return_iou_tensor(y_true,y_pred,i)
    mean_iou=mean_iou+K.mean(iou_tensor)
  return mean_iou/B
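
One thing that may be worth checking with this metric: `K.mean` averages over every grid cell, including the cells where `y_true[:,:,:,0]` is 0. The plain-Python sketch below (hypothetical helper names, made-up IoU values) compares that average with an average taken only over cells that contain an object, for an S=16 grid with two objects.

```python
S = 16  # grid cells per side, as in the post


def mean_over_all_cells(iou, mask):
    """Masked IoU averaged over every cell, empty cells included."""
    return sum(i * m for i, m in zip(iou, mask)) / len(iou)


def mean_over_object_cells(iou, mask):
    """Masked IoU averaged only over cells that contain an object."""
    num_objects = sum(mask)
    if num_objects == 0:
        return 0.0
    return sum(i * m for i, m in zip(iou, mask)) / num_objects


num_cells = S * S
mask = [0.0] * num_cells
iou = [0.0] * num_cells
for cell in (37, 150):   # two arbitrary cells holding an object
    mask[cell] = 1.0
    iou[cell] = 0.8      # made-up IoU for illustration

all_cells = mean_over_all_cells(iou, mask)
object_cells = mean_over_object_cells(iou, mask)
print(all_cells, object_cells)
```

Under these assumptions the all-cell average stays near 0.006 even though each object has an IoU of 0.8, so a metric value around 0.004 does not by itself show that the predicted boxes are poor.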

Model (based on Fast YOLO):

TL;DR picture:

Latest training parameters:

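from keras import backend as K
import tensorflow as tf
from keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Reshape
from keras.models import Model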

def custom_activation(x):
  # LEAKY RELU
  isPositive=K.cast(K.greater(x,0),K.floatx()) # The comparison output must be cast to float; bool is not accepted
  # OUTPUT OF THIS FUNCTION IS A TENSOR
  return (isPositive*x)+(1-isPositive)*0.1*x

###############  BLOCK 1 ##############################
input_=Input(shape=(256,256,3),name='input')
#zeropad1=ZeroPadding2D(padding=(3,3))(input_) # PADDING MAKES 448->3+448+3, it is required to bring output to 112

convLayer1=Conv2D(64,(7,7),strides=(2,2),padding='valid',activation=custom_activation,name='conv_layer1')(input_)
maxpoolLayer1=MaxPooling2D(pool_size=(2,2),name='max_pool_layer1')(convLayer1)
#zeropad2=ZeroPadding2D(padding=(1,1))(maxpoolLayer1)
########################################################

###############  BLOCK 2 ##############################
convLayer2=Conv2D(192,(3,3),padding='valid',activation=custom_activation,name='conv_layer2')(maxpoolLayer1)
maxpoolLayer2=MaxPooling2D(pool_size=(2,2),name='max_pool_layer2')(convLayer2)
#zeropad3=ZeroPadding2D(padding=(2,2))(maxpoolLayer2)
########################################################

###############  BLOCK 3 ##############################
convLayer3=Conv2D(128,(1,1),padding='valid',activation=custom_activation,name='conv_layer3')(maxpoolLayer2)
convLayer4=Conv2D(256,(3,3),padding='valid',activation=custom_activation,name='conv_layer4')(convLayer3)
#convLayer5=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer5')(convLayer4)
#convLayer6=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer6')(convLayer5)
maxpoolLayer3=MaxPooling2D(pool_size=(2,2),name='max_pool_layer3')(convLayer4)
#zeropad4=ZeroPadding2D(padding=(5,5))(maxpoolLayer3)
########################################################

###############  BLOCK 4 ##############################
convLayer7=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer7')(maxpoolLayer3)
convLayer8=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer8')(convLayer7)
#convLayer9=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer9')(convLayer8)
#convLayer10=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer10')(convLayer9)
#convLayer11=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer11')(convLayer10)
#convLayer12=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer12')(convLayer11)
#convLayer13=Conv2D(256,(1,1),padding='valid',activation=custom_activation,name='conv_layer13')(convLayer12)
#convLayer14=Conv2D(512,(3,3),padding='valid',activation=custom_activation,name='conv_layer14')(convLayer13)
#convLayer15=Conv2D(512,(1,1),padding='valid',activation=custom_activation,name='conv_layer15')(convLayer14)
#convLayer16=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer16')(convLayer15)
maxpoolLayer4=MaxPooling2D(pool_size=(2,2),name='max_pool_layer4')(convLayer8)
#zeropad5=ZeroPadding2D(padding=(4,4))(maxpoolLayer4)
###########################################################

###############  BLOCK 5 ##################################
convLayer17=Conv2D(512,(1,1),padding='valid',activation=custom_activation,name='conv_layer17')(maxpoolLayer4)
convLayer18=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer18')(convLayer17)
#convLayer19=Conv2D(512,(1,1),padding='valid',activation=custom_activation,name='conv_layer19')(convLayer18)
#convLayer20=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer20')(convLayer19)
#convLayer21=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer21')(convLayer20)
#convLayer22=Conv2D(1024,(3,3),strides=(2,2),padding='valid',activation=custom_activation,name='conv_layer22')(convLayer21)
#zeropad6=ZeroPadding2D(padding=(2,2))(convLayer18)
#############################################################

################ BLOCK 6 ####################################
convLayer23=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer23')(convLayer18)
#convLayer24=Conv2D(1024,(3,3),padding='valid',activation=custom_activation,name='conv_layer24')(convLayer23)
flattenedLayer1=Flatten()(convLayer23) # Flatten just converts 3d matrix to 1d so that it can be connected to a next Dense Layer
###############################################################

################ BLOCK 7 #########################################
denseLayer1=Dense(units=4096,activation=custom_activation)(flattenedLayer1)
##################################################################

################ BLOCK 8 #########################################
denseLayer2=Dense(units=S*S*(5*B+C),activation='linear')(denseLayer1)
output_=Reshape((S,S,(5*B+C)))(denseLayer2) # Reshapes the 1D to 3D
##################################################################

fast_model=Model(inputs=input_,outputs=output_)
fast_model.summary()
from keras.utils import plot_model
plot_model(fast_model,to_file='unet.png',show_shapes=True)
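
Because every convolution uses padding='valid', the spatial size shrinks at each 3x3 or 7x7 layer. The sketch below traces the feature-map side length through the uncommented layers, assuming the usual output-size formula floor((n - k) / stride) + 1 for 'valid' padding and a pooling stride equal to the pool size (the Keras default for MaxPooling2D). The helper `out_size` and the layer list are written by hand for illustration; `fast_model.summary()` remains the authoritative source.

```python
def out_size(n, kernel, stride):
    """Side length after a 'valid' convolution or pooling layer."""
    return (n - kernel) // stride + 1


# (layer name, kernel size, stride) for the layers that are not commented out.
layers = [
    ("conv_layer1", 7, 2), ("max_pool_layer1", 2, 2),
    ("conv_layer2", 3, 1), ("max_pool_layer2", 2, 2),
    ("conv_layer3", 1, 1), ("conv_layer4", 3, 1), ("max_pool_layer3", 2, 2),
    ("conv_layer7", 1, 1), ("conv_layer8", 3, 1), ("max_pool_layer4", 2, 2),
    ("conv_layer17", 1, 1), ("conv_layer18", 3, 1),
    ("conv_layer23", 3, 1),
]

size = 256        # input side length
sizes = [size]
for name, kernel, stride in layers:
    size = out_size(size, kernel, stride)
    sizes.append(size)
    print(name, size)

# conv_layer23 has 1024 filters, so Flatten() sees size * size * 1024 values.
flattened_units = size * size * 1024
print(flattened_units)
```

Under these assumptions the last convolution outputs a 2 x 2 x 1024 map, so the flattened vector feeding the first Dense layer has 4096 values.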

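from keras import callbacks
from keras.callbacks import ReduceLROnPlateau
from keras.optimizers import Adam

model=fast_model # SELECT MODEL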
model.save_weights('weights.hdf5')
model.compile(optimizer=Adam(learning_rate=1e-5),loss=yolo_loss_trial,metrics=[mean_iou_metric])
#model.compile(optimizer=SGD(learning_rate=1e-5),loss=yolo_loss_trial,metrics=[mean_iou_metric])
model.load_weights('weights.hdf5')
checkpointer = callbacks.ModelCheckpoint(filepath = 'weights.hdf5',save_best_only=True)
training_log = callbacks.TensorBoard(log_dir='./Model_logs')
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.2,patience=3, min_lr=1e-5,mode='auto') # ADD IN CALLBACK in fit()
# patience is after how many epochs if improvement is not seen, then reduce lr, newlr=lr*factor

for i in range(0,10):
  print("Iteration, ",i)
  history=model.fit(X_train,Y_train,validation_data=(X_val,Y_val),batch_size=16,epochs=32,callbacks=[training_log,checkpointer,reduce_lr],shuffle=True)
  # SAVE MODEL TO DRIVE
  !cp '/content/weights.hdf5' 'gdrive/My Drive/Colab Notebooks/Colab Datasets/Breast_Cancer_HNS/Images'
  # CONFIRM EXECUTION TIMESTAMP
  from datetime import datetime
  import pytz
  tz = pytz.timezone('Asia/Calcutta')
  now = datetime.now(tz) # current time in the Asia/Calcutta timezone
  dt_string = now.strftime("%d/%m/%Y %H:%M:%S")
  print(dt_string)
  #from google.colab import output
  #output.eval_js('new Audio("https://ssl.gstatic.com/dictionary/static/sounds/20180430/complete--_us_1.mp3").play()')
# SAVE WHOLE MODEL TO LOCAL/COLAB DRIVE
model.save("FastYolo") #Saves weights also according to official docs
# SAVE MODEL TO GOOGLE DRIVE
!cp '/content/FastYolo' '/content/gdrive/My Drive/Colab Notebooks/Colab Datasets/Shape_Detection_YOLO/None2500NoOverlap'
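
A side note on the `ReduceLROnPlateau` settings above: `min_lr` is documented as a lower bound on the learning rate, and here it equals the initial Adam learning rate of 1e-5. The sketch below is a simplified model of the reduction step (multiply by `factor`, then clamp at `min_lr`), not the Keras source, and `next_lr` is a hypothetical helper.

```python
def next_lr(current_lr, factor, min_lr):
    """Simplified plateau step: scale the rate, but never go below min_lr."""
    return max(current_lr * factor, min_lr)


# Values from the post: lr = 1e-5, factor = 0.2, min_lr = 1e-5.
lr_from_post = next_lr(1e-5, 0.2, 1e-5)
# Hypothetical larger starting rate, for comparison.
lr_from_larger_start = next_lr(1e-3, 0.2, 1e-5)
print(lr_from_post, lr_from_larger_start)
```

With the values used in this post the learning rate would stay at 1e-5, so the callback could only take effect if training started from a larger learning rate or if `min_lr` were lowered.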