How to control when to compute evaluation vs. training using TensorFlow's Estimator API?

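As stated in a related question:

The TensorFlow documentation does not provide any example of how to periodically evaluate a model on an evaluation set.

The accepted answer suggests using Experiment (which is reportedly deprecated).

Everything I found online points toward using train_and_evaluate instead. However, I still do not see how to switch between the two processes (training and evaluation). I tried the following: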

import tensorflow as tf  # TF 1.x API (tf.estimator, tf.logging)

# model_fn, hparams, model_dir, input_fn, train_file and val_file
# are defined elsewhere.
estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    params=hparams,
    model_dir=model_dir,
    config=tf.estimator.RunConfig(
        save_checkpoints_steps=2000,
        save_summary_steps=100,
        keep_checkpoint_max=5
    )
)

train_input_fn = lambda: input_fn(
    train_file,  # a .tfrecords file
    train=True,
    batch_size=70,
    num_epochs=100
)

eval_input_fn = lambda: input_fn(
    val_file,  # another .tfrecords file
    train=False,
    batch_size=70,
    num_epochs=1
)

train_spec = tf.estimator.TrainSpec(
    train_input_fn,
    max_steps=125
)

eval_spec = tf.estimator.EvalSpec(
    eval_input_fn,
    steps=30,
    name='validation',
    start_delay_secs=150,
    throttle_secs=200
)

tf.logging.info("start experiment...")
tf.estimator.train_and_evaluate(
    estimator,
    train_spec,
    eval_spec
)

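Here is what I think my code should do:

Train the model for 100 epochs using batches of size 70; save a checkpoint every 2000 batches; save summaries every 100 batches; keep at most 5 checkpoints; after 150 batches on the training set, compute the validation error using 30 batches of validation data.

However, I get the following logs: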

INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 1 into /output/model.ckpt.
INFO:tensorflow:loss = 39.55082, step = 1
INFO:tensorflow:global_step/sec: 178.622
INFO:tensorflow:loss = 1.0455043, step = 101 (0.560 sec)
INFO:tensorflow:Saving checkpoints for 150 into /output/model.ckpt.
INFO:tensorflow:Loss for final step: 0.8327793.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2018-04-02-22:49:15
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /projects/MNIST-GCP/output/model.ckpt-150
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [3/30]
INFO:tensorflow:Evaluation [6/30]
INFO:tensorflow:Evaluation [9/30]
INFO:tensorflow:Evaluation [12/30]
INFO:tensorflow:Evaluation [15/30]
INFO:tensorflow:Evaluation [18/30]
INFO:tensorflow:Evaluation [21/30]
INFO:tensorflow:Evaluation [24/30]
INFO:tensorflow:Evaluation [27/30]
INFO:tensorflow:Evaluation [30/30]
INFO:tensorflow:Finished evaluation at 2018-04-02-22:49:15
INFO:tensorflow:Saving dict for global step 150: accuracy = 0.8552381, global_step =150, loss = 0.95031387

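As the logs show, training appears to stop after the first evaluation step. What am I missing in the documentation? Could you explain how I should implement what I think my code is doing?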

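Additional info: I am running everything on the MNIST dataset, which has 50,000 images in the training set, so (I think) the model should run for num_epochs × 50000 / batch_size = 100 × 50000 / 70 ≈ 71,400 steps.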
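That estimate can be checked with a quick calculation. This is only a sketch: it assumes the dataset is repeated before it is batched and that the final partial batch is kept, and the variable names are illustrative rather than taken from the input function.

```python
import math

num_examples = 50000   # MNIST training images, as stated above
batch_size = 70
num_epochs = 100

# Total batches if all epochs are consumed (last partial batch included).
expected_steps = math.ceil(num_examples * num_epochs / batch_size)
print(expected_steps)  # 71429
```

Either way, the figure is far above the max_steps=125 passed to TrainSpec.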

I sincerely appreciate your help.

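Edit: after running experiments, I realized that max_steps controls the number of steps for the whole training process, not just the number of steps before computing metrics on the test set. Reading the documentation for tf.estimator.Estimator.train, I see that it has a steps argument that works incrementally and is bounded by max_steps; however, tf.estimator.TrainSpec has no steps argument, which means I cannot control the number of steps to take before computing metrics on the validation set.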
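One possible workaround, sketched here under assumptions rather than taken from the documentation, is to skip train_and_evaluate and alternate Estimator.train(steps=...) and Estimator.evaluate(...) by hand. The helper name train_with_periodic_eval and its parameters are hypothetical; it only relies on the train and evaluate methods of the estimator it is given.

```python
def train_with_periodic_eval(estimator, train_input_fn, eval_input_fn,
                             total_steps, eval_every, eval_steps=None):
    """Alternate training and evaluation in fixed-size chunks of steps.

    Hypothetical helper: `estimator` is expected to behave like a
    tf.estimator.Estimator (train(input_fn, steps) / evaluate(input_fn, steps, name)).
    """
    results = []
    steps_requested = 0
    while steps_requested < total_steps:
        chunk = min(eval_every, total_steps - steps_requested)
        # `steps` is incremental: train `chunk` more steps, resuming from
        # the latest checkpoint in model_dir.
        estimator.train(input_fn=train_input_fn, steps=chunk)
        steps_requested += chunk
        # Evaluate the checkpoint written at the end of the train() call.
        metrics = estimator.evaluate(input_fn=eval_input_fn,
                                     steps=eval_steps, name="validation")
        results.append((steps_requested, metrics))
    return results
```

With the objects from the question, a call could look like train_with_periodic_eval(estimator, train_input_fn, eval_input_fn, total_steps=7000, eval_every=150, eval_steps=30). Note that every train() call restarts the input pipeline, and that the loop counts requested steps, so it will overcount if the training data runs out early.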

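In fact, the estimator switches from the training phase to the evaluation phase every 200 seconds (throttle_secs) or when your training finishes.

However, your code shows that all 125 steps complete before evaluation starts, which means your training has already finished. max_steps is the total number of training steps before stopping, and it has no connection to the number of epochs (which tf.estimator.train_and_evaluate does not use). During training, your evaluation metrics will appear every throttle_secs (= 200) seconds.

Regarding metrics, you can add the following to your model: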

# logits, labels and mode come from the enclosing model_fn.
predict = tf.nn.softmax(logits, name="softmax_tensor")
classes = tf.cast(tf.argmax(predict, 1), tf.uint8)

def conv_model_eval_metrics(classes, labels, mode):
    if mode == tf.estimator.ModeKeys.TRAIN or mode == tf.estimator.ModeKeys.EVAL:
        # tf.metrics.* take labels first, then predictions; keyword arguments
        # avoid accidentally swapping precision and recall.
        # Note: precision and recall cast their inputs to bool, so on
        # multi-class labels they only distinguish zero from nonzero.
        return {
            'accuracy': tf.metrics.accuracy(labels=labels, predictions=classes),
            'precision': tf.metrics.precision(labels=labels, predictions=classes),
            'recall': tf.metrics.recall(labels=labels, predictions=classes),
        }
    else:
        return None

eval_metrics = conv_model_eval_metrics(classes, labels, mode)
with tf.variable_scope("performance_metrics"):
    # Accuracy is the most intuitive performance measure: the ratio of
    # correctly predicted observations to the total observations.
    # Index [1] of each metric tuple is its update op.
    tf.summary.scalar('accuracy', eval_metrics['accuracy'][1])

    # How many selected items are relevant?
    # Precision is the ratio of correctly predicted positive observations
    # to the total predicted positive observations.
    tf.summary.scalar('precision', eval_metrics['precision'][1])

    # How many relevant items are selected?
    # Recall is the ratio of correctly predicted positive observations
    # to all observations in the actual class.
    tf.summary.scalar('recall', eval_metrics['recall'][1])

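This works very well for tracking precision, recall, and accuracy in TensorBoard during both training and evaluation.

PS: Sorry, this is my first answer, which is why it is so painful to read ^^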

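Repetition can be controlled with tf.data.Dataset.repeat(num_epochs), set in the input function. The training function runs until the given number of epochs is consumed, then the evaluation function runs, then the training function runs again until the epochs are consumed, and so on. Finally, when the max_steps defined in TrainSpec is reached, train_and_evaluate stops.

This is what I concluded from a few experiments; corrections are welcome.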
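The alternation described above can be modelled with a few lines of plain Python. This is a rough sketch of that experimental conclusion, not of TensorFlow's actual implementation: it ignores throttle_secs and start_delay_secs, assumes repeat() is applied before batching with the last partial batch kept, and the function name simulate_train_eval_cycles is made up for illustration.

```python
import math

def simulate_train_eval_cycles(num_examples, batch_size, num_epochs, max_steps):
    """Return the global steps at which evaluation would run under this model."""
    # Steps one training call can take before its input is exhausted.
    steps_per_call = math.ceil(num_examples * num_epochs / batch_size)
    eval_points = []
    global_step = 0
    while global_step < max_steps:
        # Train until the repeated dataset runs out or max_steps is hit.
        global_step = min(global_step + steps_per_call, max_steps)
        # Evaluation follows each training call.
        eval_points.append(global_step)
    return eval_points

# One epoch per training call, capped at 2000 steps overall.
print(simulate_train_eval_cycles(50000, 70, 1, 2000))    # [715, 1430, 2000]
# The settings from the question: max_steps is reached long before the data runs out.
print(simulate_train_eval_cycles(50000, 70, 100, 125))   # [125]
```

Under this model, a small num_epochs in the training input function together with a large max_steps gives one evaluation per pass over the data, whereas the settings from the question give a single evaluation at the very end.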

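As I understand it, evaluation is run on the model restored from the latest checkpoint. In your case, no checkpoint is saved until step 2000. You also specified max_steps=125, which takes precedence over the dataset you feed to the model.

So even though you specified a batch size of 70 and 100 epochs, your model stops training at step 125, far below the 2000-step checkpoint interval, and that in turn limits evaluation, because evaluation depends on a checkpointed model.

Note: by default, evaluation happens every time a checkpoint is saved, provided you have not set a throttle_secs limit.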

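Thank you for your answer! While it is useful, it does not answer the question. I will post an answer based on what I believe I learned from some experiments I ran.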