TensorFlow: What is a clean way to get the validation loss with tf.train.MonitoredTrainingSession?

Tags: tensorflow, distributed

I am building a distributed TensorFlow model, and I am a bit confused about how to use tf.train.MonitoredTrainingSession in a clean way.

Here is my training code:

# Define the number of training steps
hooks = [tf.train.StopAtStepHook(last_step=FLAGS.nb_train_step)]

with tf.train.MonitoredTrainingSession(master=target,
                                       is_chief=(FLAGS.task_index == 0),
                                       checkpoint_dir=FLAGS.logs_dir,
                                       hooks=hooks) as sess:

    while not sess.should_stop():
        batch_train = next(gen_train)  # training data generator

        feed_dict = {X: batch_train[0],
                     Y: batch_train[1]}

        # Run the train op together with the loss and the merged summaries
        fetches = [loss, merged_summary, train_step]
        current_loss, summary, _ = sess.run(fetches, feed_dict)
        print("Batch loss: %s" % current_loss)

Now, if I want to get my model's validation loss every n training steps, I can add a block that is evaluated every n steps:

batch_val = next(gen_val)  # validation data generator
feed_dict = {X: batch_val[0],
             Y: batch_val[1]}

# Only the loss is fetched here; train_step is not run
val_loss = sess.run([loss], feed_dict)
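
But this increases the number of steps seen by my hooks, which means the validation loss computation is counted as a training step. Is there a clean way to do this? Am I misunderstanding the role of hooks?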
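
To make the "every n steps" idea concrete, here is a minimal, framework-free sketch of the control flow. It is not TensorFlow code: `train_with_periodic_validation`, `fake_train_step` and `fake_val_loss` are hypothetical names standing in for the `sess.run` calls above. The assumption it illustrates is that the step counter is advanced only by the training call, so validation evaluations are never counted as training steps.

```python
def train_with_periodic_validation(train_step_fn, val_loss_fn,
                                   last_step, validate_every):
    """Run `last_step` training steps and evaluate the validation loss
    every `validate_every` steps.

    Only calls to `train_step_fn` advance the step counter, so the
    validation evaluations do not count as training steps.
    """
    global_step = 0
    val_history = []  # list of (step, validation loss) pairs
    while global_step < last_step:
        train_step_fn()
        global_step += 1
        if global_step % validate_every == 0:
            val_history.append((global_step, val_loss_fn()))
    return global_step, val_history


# Hypothetical stand-ins for sess.run(train_step, ...) and sess.run(loss, ...).
calls = {"train": 0, "val": 0}


def fake_train_step():
    calls["train"] += 1


def fake_val_loss():
    calls["val"] += 1
    return 1.0 / calls["val"]


final_step, history = train_with_periodic_validation(
    fake_train_step, fake_val_loss, last_step=10, validate_every=5)
print(final_step, history)
```

Whether a real hook behaves this way depends on what it counts: a hook that reads a global step variable incremented by the train op would presumably be unaffected by extra loss-only runs, while a hook that counts `run` calls would not.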