Python TensorFlow: how to write a multistep decay

python tensorflow

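Caffe has a multistep decay. It is computed as base_lr * gamma ^ (floor(step)), where step is incremented after each decay step. For example, with decay steps [100, 200] and global step = 101 I want to get base_lr * gamma^1; for global step = 201 and beyond I want to get base_lr * gamma^2, and so on.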

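I tried to implement it based on the source of exponential_decay, but I got nowhere. Here is the code of exponential_decay():

I have to pass decay_steps as some kind of array: a Python list or a tensor. I also have to (?) pass current_decay_step (step in the formula above).

First option: in pure Python without tensors it is very simple: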

decay_steps.append(global_step)
p = sorted(decay_steps).index(global_step) # may be there must be `+1` or `-1`. I hope that main idea is clear
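A slightly more direct version of the same idea, as a minimal sketch: instead of appending and sorting, `bisect` counts how many boundaries have been reached. The function name `multistep_lr` and the convention that the rate drops at the boundary step itself (`>=`) are assumptions; the question does not pin down the off-by-one behaviour.

```python
import bisect


def multistep_lr(base_lr, gamma, decay_steps, global_step):
    """Return base_lr * gamma ** k, where k is the number of boundaries reached.

    Assumption: a boundary counts as reached once global_step >= boundary.
    decay_steps must be sorted in ascending order.
    """
    k = bisect.bisect_right(decay_steps, global_step)
    return base_lr * gamma ** k


if __name__ == "__main__":
    for step in (0, 99, 101, 201):
        print(step, multistep_lr(0.1, 0.1, [100, 200], step))
```

With `bisect_left` instead of `bisect_right`, the rate would drop one step later, which may be closer to what a given reference implementation does.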

I can't do that in TF because there is no sorting in TF. I don't know how much time it would take to implement it.
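Sorting may not be needed at all: the exponent is just the number of boundaries that global_step has reached, which can be obtained with an elementwise comparison followed by a sum. The sketch below shows this in NumPy. The assumption, not tested here, is that the same three operations map onto `tf.greater_equal`, `tf.cast` and `tf.reduce_sum` in graph code; the function names are made up for illustration.

```python
import numpy as np


def decay_exponent(global_step, decay_steps):
    """Count how many boundaries have been reached, without sorting or indexing."""
    boundaries = np.asarray(decay_steps)
    reached = global_step >= boundaries      # elementwise boolean mask
    return int(np.sum(reached.astype(np.int64)))


def multistep_lr_vectorized(base_lr, gamma, decay_steps, global_step):
    # Assumption: the rate drops at the boundary step itself (>=).
    return base_lr * gamma ** decay_exponent(global_step, decay_steps)


if __name__ == "__main__":
    for step in (0, 101, 201):
        print(step, multistep_lr_vectorized(0.1, 0.1, [100, 200], step))
```

Because the count does not depend on the order of the boundaries, decay_steps does not even have to be sorted for this variant.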

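Second option: something like the code below. It doesn't work for several reasons. First, I don't know how to pass arguments to the functions in tf.cond. Second, even if I do pass the args, it may not work:

Third option: it won't work, because I can't get an element with tensor[another_tensor]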

    # if len(decay_steps) > (current_step + 1):
    #    if global_step > decay_steps[current_step + 1]:
    #        current_step += 1


    current_decay_step = tf.cond(tf.greater(tf.shape(current_decay_step)[0], tf.add(current_decay_step,1)),
                                 tf.cond(tf.greater(global_step, decay_steps[tf.add(current_decay_step + 1]), tf.add(current_decay_step,1), tf.add(current_decay_step,0)),
                                 tf.add(current_decay_step, 0)

What can I do?

UPD: I can almost do it with the second option.

I can do

   def nothing(): return tf.no_op()
   tf.cond(tf.greater(global_step, decay_steps[0]),
                    functools.partial(new_decay_step, decay_steps),
                    nothing)

But for some reason the inner tf.cond doesn't work.

With this code I get the error "fn1 must be callable":

   def nothing(): return tf.no_op()
   tf.cond(tf.greater(tf.shape(decay_steps)[0], 0),
            tf.cond(tf.greater(global_step, decay_steps[0]),
                    functools.partial(new_decay_step, decay_steps),
                    nothing),
            nothing)

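UPD2: the inner tf.cond won't work because it returns a tensor, while the args must be functions.

I haven't checked it, but it seems this could work (at least it doesn't crash with an error):

UPD3: I realized that the code in UPD2 can't work, because I can't modify the list inside the function.

I also don't know which parts of tf.logical_and are actually executed.

I wrote the following code: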
class ohmy:
    def __init__(self, decay_steps):
        self.decay_steps = decay_steps

    def multistep_decay(self, learning_rate, global_step, current_decay_step, decay_steps, decay_rate,
                    staircase=False, name=None):

        learning_rate = tf.convert_to_tensor(learning_rate, name="learning_rate")
        dtype = learning_rate.dtype
        global_step = tf.cast(global_step, dtype)

        decay_rate = tf.cast(decay_rate, dtype)

        def new_step():
            self.decay_steps = self.decay_steps[1:]
            current_decay_step.assign(current_decay_step + 1)
            return current_decay_step

        def curr_step():
            return current_decay_step

        current_decay_step = tf.cond(tf.logical_and(tf.greater(tf.shape(self.decay_steps)[0], 0),  tf.greater(global_step, self.decay_steps[0])),
                new_step,
                curr_step)

        a = tf.Print(global_step, [global_step], "global")
        b = tf.Print(self.decay_steps, [self.decay_steps], "decay_steps")
        c = tf.Print(current_decay_step, [current_decay_step], "step")

        with tf.control_dependencies([a, b, c, current_decay_step]):
            p = current_decay_step

            if staircase:
                p = tf.floor(p)

            return tf.mul(learning_rate, tf.pow(decay_rate, p), name=name)


decay_steps = [3,4,5,6,7]
decay_steps = tf.convert_to_tensor(decay_steps, dtype=tf.float32)
current_decay_step = tf.Variable(0.0, trainable=False)
global_step = tf.Variable(0, trainable=False)
decay_rate = 0.5

c=ohmy(decay_steps)
lr = ohmy.multistep_decay(c, 0.010, global_step, current_decay_step, decay_steps, decay_rate)
#lr = tf.train.exponential_decay(0.001, global_step=global_step, decay_steps=2, decay_rate=0.5, staircase=True)
tf.scalar_summary('learning_rate', lr)

opt = tf.train.AdamOptimizer(lr)
#...train loop and so on

This doesn't work at all. Here is the output:
I tensorflow/core/kernels/logging_ops.cc:79] step[0]
I tensorflow/core/kernels/logging_ops.cc:79] global[0]
E tensorflow/core/client/tensor_c_api.cc:485] The tensor returned for MergeSummary/MergeSummary:0 was not valid.
Traceback (most recent call last):
  File "flownet_new.py", line 528, in <module>
    summary_str = sess.run(summary_op)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 382, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 655, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 723, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 743, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: The tensor returned for MergeSummary/MergeSummary:0 was not valid.

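As you can see, there is no output for decay_steps. I can't even debug it.
Now I really have no idea how to implement this with a single function.
By the way, either I am doing something wrong, or tf.contrib.slim does not work with learning rate decay.
For now, the simplest solution is to do what was suggested above and compute the rate you want in the training loop.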
Use tf.train.exponential_decay, which is exactly what you want. The decayed learning rate is computed as follows:
decayed_learning_rate = learning_rate *
                    decay_rate ^ (global_step / decay_steps)

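Note that the decay_steps argument is an integer (not an array or a tensor) that holds the number of iterations after which the learning rate changes. In your example, decay_steps=100.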
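To see what this formula gives in the question's setting, here is a plain-Python sketch of it. The `staircase` flag mirrors the option of the same name in `tf.train.exponential_decay` (whole completed periods instead of a fractional exponent); the function name `exponential_decay_value` is made up for illustration.

```python
def exponential_decay_value(learning_rate, global_step, decay_steps,
                            decay_rate, staircase=False):
    """learning_rate * decay_rate ** (global_step / decay_steps)."""
    if staircase:
        exponent = global_step // decay_steps          # completed periods only
    else:
        exponent = float(global_step) / decay_steps    # smooth decay
    return learning_rate * decay_rate ** exponent


if __name__ == "__main__":
    for step in (99, 101, 201, 301):
        print(step, exponential_decay_value(0.1, step, 100, 0.1, staircase=True))
```

The drops are necessarily spaced decay_steps apart, so a schedule with unequal intervals cannot be expressed this way.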
You can try using tf.case, or switch and merge.

For example, assuming base_lr is 0.1 and gamma is 0.1, you can use
import tensorflow as tf
from tensorflow.python.ops import control_flow_ops

global_step = tf.placeholder(dtype=tf.int64)

learning_rate = tf.case(
    [(tf.less(global_step, 100), lambda: tf.constant(0.1)),
     (tf.less(global_step, 200), lambda: tf.constant(0.01))],
    default=lambda: tf.constant(0.001))

with tf.Session() as sess:
    print(sess.run(learning_rate, {global_step: 0}))   # 0.1
    print(sess.run(learning_rate, {global_step: 1}))   # 0.1
    print(sess.run(learning_rate, {global_step: 99}))  # 0.1
    print(sess.run(learning_rate, {global_step: 100})) # 0.01
    print(sess.run(learning_rate, {global_step: 101})) # 0.01
    print(sess.run(learning_rate, {global_step: 199})) # 0.01
    print(sess.run(learning_rate, {global_step: 200})) # 0.001
    print(sess.run(learning_rate, {global_step: 201})) # 0.001

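or, with switch and merge (the control_flow_ops.merge listing near the end of this page).
This code was tested with TensorFlow 0.12.1.
I was looking for this functionality in TensorFlow and found that it can be implemented easily with tf.train.piecewise_constant. Here is an example from the TensorFlow API docs: ()
Example: use a learning rate of 1.0 for the first 100000 steps, 0.5 for steps 100001 to 110000, and 0.1 for any additional steps.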
global_step = tf.Variable(0, trainable=False)
boundaries = [100000, 110000]
values = [1.0, 0.5, 0.1]
learning_rate = tf.train.piecewise_constant(global_step, boundaries, values)

Later, whenever we perform an optimization step, we increment global_step.
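To get a Caffe-style multistep schedule out of this function, the values list can be generated from base_lr and gamma rather than typed by hand. A minimal sketch, where `multistep_values` is a hypothetical helper and not part of TensorFlow:

```python
def multistep_values(base_lr, gamma, boundaries):
    """One learning rate per interval, i.e. len(boundaries) + 1 values."""
    return [base_lr * gamma ** i for i in range(len(boundaries) + 1)]


boundaries = [100, 200]
values = multistep_values(0.1, 0.1, boundaries)
print(values)
```

The resulting boundaries and values can then be passed to tf.train.piecewise_constant(global_step, boundaries, values) in the same way as in the example above. The boundaries may be spaced unequally, which is what the question asks for.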
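See Stack Overflow question #33919948: you can simply make your learning rate a variable, and then use an assign op to set it to any value you want (e.g. one computed in tensor-free code).
Thanks. Yes, if I build the training loop by hand, I can do that. But I'm not sure whether I can do it with tf.contrib.slim.learning.train (see).
Sorry, the URL was wrong. The docs are here:
I'm not familiar with tf.contrib.slim. But it looks like you define the optimizer (e.g. AdaGrad) here: def train_step(sess, train_op, global_step, train_step_kwargs), in global_step. Since you can initialize any optimizer's learning rate as a TF variable, you can initialize it that way and then pass it into that function (which is called by train, which accepts these arguments in train_step_kwargs=_USE_DEFAULT).
Sorry, the authors of the net I'm trying to implement use unequal decay intervals. That is why I talked about tensors or arrays. For example, with this function you cannot decay after 100, 200 and 300 and then continue to 1000 without changing the learning rate. I'm almost sure I could change other hyperparameters and the network structure and still get good training, but first I want to check their way.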
import tensorflow as tf
from tensorflow.python.ops import control_flow_ops

global_step = tf.placeholder(dtype=tf.int64)

learning_rate = control_flow_ops.merge(
    [control_flow_ops.switch(tf.constant(0.1), 
                             tf.less(global_step, 100))[1],
     control_flow_ops.switch(tf.constant(0.01), 
                             tf.logical_and(tf.greater_equal(global_step, 100),
                                            tf.less(global_step, 200)))[1],
     control_flow_ops.switch(tf.constant(0.001), 
                             tf.greater_equal(global_step, 200))[1]])[0]

with tf.Session() as sess:
    print(sess.run(learning_rate, {global_step: 0}))   # 0.1
    print(sess.run(learning_rate, {global_step: 1}))   # 0.1
    print(sess.run(learning_rate, {global_step: 99}))  # 0.1
    print(sess.run(learning_rate, {global_step: 100})) # 0.01
    print(sess.run(learning_rate, {global_step: 101})) # 0.01
    print(sess.run(learning_rate, {global_step: 199})) # 0.01
    print(sess.run(learning_rate, {global_step: 200})) # 0.001
    print(sess.run(learning_rate, {global_step: 201})) # 0.001
