Python Tensorflow: huge loss function value output

I want to implement YOLO with TensorFlow. But after I wrote the loss function and trained the network, the loss values are huge, as shown below:

2016-09-11 13:55:03.753679: step 0, loss = 3371113119744.00 (3.9 examples/sec; 2.548 sec/batch)  
2016-09-11 13:55:14.444871: step 10, loss = nan (19.8 examples/sec; 0.505 sec/batch)
The loss is already enormous at step 0, and at the later steps it becomes nan. I can't figure out what the cause is. This is the YOLO loss function I wrote:

def inference_loss(y_out, y_true):
    '''
    Args:
      y_true: Ground truth output
      y_out: Predicted output
      The form of the ground truth vector is:
      ######################################
      ## 1225 values in total: 7*7 = 49 cells
      ## each cell vector has 25 values: bounding box (x,y,h,w), class one-hot vector (p1,p2,...,p20), objectness score (0 or 1)
      ## 49 * 25 = 1225
      ######################################

    Returns:
      The loss caused by y_out
    '''
    lambda_coor = 5
    lambda_noobj = 0.5

    box_loss = 0.0
    score_loss = 0.0
    class_loss = 0.0

    for i in range(49):
        # the first predicted bounding box
        y_out_box1 = y_out[:, i*30:i*30+4]
        # the second predicted bounding box
        y_out_box2 = y_out[:, i*30+4:i*30+8]
        # ground truth bounding box
        y_true_box = y_true[:, i*25:i*25+4]
        # l2 loss of the predicted bounding boxes
        box_loss_piece = tf.reduce_sum(tf.square(y_true_box - y_out_box1), 1) + tf.reduce_sum(tf.square(y_true_box - y_out_box2), 1)
        # bounding box loss, masked by the objectness indicator
        box_loss_piece = box_loss_piece * lambda_coor * y_true[:, i*25+24]

        box_loss = box_loss + box_loss_piece
        # predicted scores
        y_out_score1 = y_out[:, i*30+8]
        y_out_score2 = y_out[:, i*30+9]
        # ground truth score
        y_true_score = y_true[:, i*25+24]
        # score loss term for cells that contain an object
        score_loss1_piece = tf.square(y_true_score - y_out_score1) + tf.square(y_true_score - y_out_score2)
        # score loss term for cells that contain no object
        score_loss2_piece = lambda_noobj * score_loss1_piece
        # mask each term with the objectness indicator
        score_loss1_piece = score_loss1_piece * y_true[:, i*25+24]
        score_loss2_piece = score_loss2_piece * (1 - y_true[:, i*25+24])

        score_loss = score_loss + score_loss1_piece + score_loss2_piece
        # predicted class vector and one-hot ground truth class vector
        y_out_class = y_out[:, i*30+10:(i+1)*30]
        y_true_class = y_true[:, i*25+4:i*25+24]
        # class loss
        class_loss_piece = tf.reduce_sum(tf.square(y_true_class - y_out_class), 1)
        class_loss = class_loss + class_loss_piece * y_true[:, i*25+24]

    # total loss of one batch
    loss = tf.reduce_sum(box_loss + score_loss + class_loss, 0)
    return loss
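For readers trying to follow the docstring's label layout, here is a minimal sketch (not part of the original post) of how one 7*7*25 = 1225-value ground-truth vector with that layout could be assembled in NumPy; the cell index, box values, and class index below are made-up example values.

import numpy as np

def encode_ground_truth(cell_index, box_xyhw, class_index):
    # Layout per cell (25 values): bounding box (x, y, h, w),
    # class one-hot vector (p1..p20), objectness score (0 or 1).
    y_true = np.zeros(1225, dtype=np.float32)
    offset = cell_index * 25
    y_true[offset:offset + 4] = box_xyhw           # bounding box
    y_true[offset + 4 + class_index] = 1.0         # class one-hot entry
    y_true[offset + 24] = 1.0                      # objectness score
    return y_true

# e.g. one object of class 11 in cell 24 (the centre cell of the 7x7 grid)
label = encode_ground_truth(24, [0.5, 0.5, 0.3, 0.2], class_index=11)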
This is the training code I wrote:

def train_test():
    with tf.Graph().as_default():
        global_step = tf.Variable(0, trainable=False)

        data_batch_generator = yolo_inputs.generate_batch_data(voclabelpath, imagenamefile, BATCH_NUM, sample_number=10000, iteration = 5000)

        training_image_batch = tf.placeholder(tf.float32, shape = [BATCH_NUM, 448, 448, 3])
        training_label_batch = tf.placeholder(tf.float32, shape = [BATCH_NUM, 1225])

        #inference and loss
        yolotinyinstance = yolo_tiny.YOLO()
        yolotinyinstance.build(training_image_batch)        
        net_out = yolotinyinstance.fc12

        loss = inference_loss(net_out, training_label_batch)          

        train_op = train(loss, global_step)

        saver = tf.train.Saver(tf.all_variables())

        summary_op = tf.merge_all_summaries()

        init = tf.initialize_all_variables()

        sess = tf.Session()
        sess.run(init)          

        summary_writer = tf.train.SummaryWriter(TRAIN_DIR, sess.graph)

        step = 0

        for x,y in data_batch_generator:
            start_time = time.time()

            _, loss_value = sess.run([train_op, loss], feed_dict = {training_image_batch: x, 
                                     training_label_batch:y})
This has me stumped. Can anyone help?
Thank you very much.

What is your learning rate? This behavior can show up when the learning rate is too high. Are you using normalized coordinates in [0.0, 1.0]? YOLO and other single-shot methods normalize by the image width and height, which helps keep the loss, gradients, etc. in a workable range.

Thanks. The input images are already normalized to [-1, 1]. I checked the network's outputs: the last layer's output is very large, around 10e4, but the outputs of the first few conv layers are not very large, around 1e-1. I am checking where the problem is.
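To illustrate the commenter's suggestion, here is a minimal sketch of scaling pixel-space box values into [0.0, 1.0] by the image width and height before building the label vector; the 448x448 input size comes from the question, the function name and example values are illustrative, not part of the original post.

import numpy as np

IMAGE_W, IMAGE_H = 448, 448  # network input size used in the question

def normalize_box(x, y, h, w):
    # Divide pixel coordinates/sizes by the image dimensions so every box
    # value lies in [0.0, 1.0], keeping the squared-error terms small.
    return np.array([x / IMAGE_W, y / IMAGE_H, h / IMAGE_H, w / IMAGE_W],
                    dtype=np.float32)

# e.g. a 150x100-pixel box centred at (224, 224)
print(normalize_box(224, 224, 150, 100))  # -> [0.5, 0.5, ~0.33, ~0.22]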