
Python TensorFlow Probability: training a bijector


I have been trying to follow this example, but I am having trouble training any of the variables.

I wrote a small example of my own, but I could not get that to work either:

# Train a shift bijector
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

shift = tf.Variable(initial_value=tf.convert_to_tensor([1.0], dtype=tf.float32), trainable=True, name='shift_var')
bijector = tfp.bijectors.Shift(shift=shift)

# Input and target
x = tf.convert_to_tensor(np.array([0]), dtype=tf.float32)
target = tf.convert_to_tensor(np.array([2]), dtype=tf.float32)

optimizer = tf.optimizers.Adam(learning_rate=0.5)
nsteps = 1

print(bijector(x).numpy(), bijector.shift)
for _ in range(nsteps):
    with tf.GradientTape() as tape:
        out = bijector(x)
        loss = tf.math.square(tf.math.abs(out - target))
        gradients = tape.gradient(loss, bijector.trainable_variables)
    optimizer.apply_gradients(zip(gradients, bijector.trainable_variables))
print(bijector(x).numpy(), bijector.shift)
With nsteps = 1, the two print statements produce the following output:

[1.] <tf.Variable 'shift_var:0' shape=(1,) dtype=float32, numpy=array([1.], dtype=float32)>
[1.] <tf.Variable 'shift_var:0' shape=(1,) dtype=float32, numpy=array([1.4999993], dtype=float32)>

So the variable is updated, but the bijector output is not. I am using

tensorflow version 2.3.0
tensorflow-probability version 0.11.0

I also tried this in a Colab notebook, so I doubt it is a version issue.
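For reference, the same loop trains as expected when the shift is applied as a plain addition instead of through the bijector (a minimal sketch; x + shift stands in for tfp.bijectors.Shift):

```python
import tensorflow as tf

# Same setup as above, but the shift is applied directly (x + shift)
# instead of through tfp.bijectors.Shift.
shift = tf.Variable([1.0], dtype=tf.float32, trainable=True, name='shift_var')
x = tf.constant([0.0], dtype=tf.float32)
target = tf.constant([2.0], dtype=tf.float32)

optimizer = tf.optimizers.Adam(learning_rate=0.5)

with tf.GradientTape() as tape:
    out = x + shift                      # plain addition, no bijector involved
    loss = tf.math.square(tf.math.abs(out - target))
gradients = tape.gradient(loss, [shift])
optimizer.apply_gradients(zip(gradients, [shift]))

print((x + shift).numpy())  # ~[1.5]: the update is reflected in the output
```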

Still not sure I fully understand what is going on here, but at least I can get my example to work now.

For some reason the behavior is different if I wrap it in a class that inherits from tf.keras.Model:

class BijectorModel(tf.keras.Model):

    def __init__(self):
        super().__init__()

        self.shift = tf.Variable(initial_value=tf.convert_to_tensor([1.5], dtype=tf.float32), trainable=True, name='shift_var')
        self.bijector = tfp.bijectors.Shift(shift=self.shift)

    def call(self, input):
        return self.bijector(input)

I created a function for the training iteration, although that does not seem to be required:

def training_iteration(model, input, target):

    optimizer = tf.optimizers.SGD(learning_rate=0.1)

    with tf.GradientTape() as tape:

        loss = tf.math.square(tf.math.abs(model(input) - target))

        gradients = tape.gradient(loss, model.trainable_variables)

    optimizer.apply_gradients(zip(gradients, model.trainable_variables))

Executing it like this

x = tf.convert_to_tensor(np.array([0]), dtype=tf.float32)
target = tf.convert_to_tensor(np.array([2]), dtype=tf.float32)
model = BijectorModel()

nsteps = 10
for _ in range(nsteps):
    training_iteration(model, x, target)
    print('Iteration {}: Output {}'.format(_, model(x)))
produces the expected/desired output:

Iteration 0: Output [1.6]
Iteration 1: Output [1.6800001]
Iteration 2: Output [1.7440001]
Iteration 3: Output [1.7952001]
Iteration 4: Output [1.8361601]
Iteration 5: Output [1.8689281]
Iteration 6: Output [1.8951424]
Iteration 7: Output [1.916114]
Iteration 8: Output [1.9328911]
Iteration 9: Output [1.9463129]

My conclusion is that trainable variables are handled differently when they are part of a Model than when they are accessed through the bijector object.

You have found a bug. The bijector forward function weakly caches the result -> input mapping to make downstream inverse and log-determinant computations fast, but somehow this also interferes with the gradients. The workaround is to add a
del out
, which then produces:

Iteration 0: Output [1.6]
Iteration 1: Output [1.6800001]
Iteration 2: Output [1.7440001]
Iteration 3: Output [1.7952001]
Iteration 4: Output [1.8361601]
Iteration 5: Output [1.8689281]
Iteration 6: Output [1.8951424]
Iteration 7: Output [1.916114]
Iteration 8: Output [1.9328911]
Iteration 9: Output [1.9463129]