Python: Can't create non-trainable variables in TensorFlow v2


I am implementing a batch normalization layer by hand, but the code in the __init__ function that creates the non-trainable variables does not seem to work. Code:

import tensorflow as tf
class batchNormalization(tf.keras.layers.Layer):
    def __init__(self, shape, Trainable, **kwargs):
        super(batchNormalization, self).__init__(**kwargs)
        self.shape = shape
        self.Trainable = Trainable
        # Learnable offset and scale
        self.beta = tf.Variable(initial_value=tf.zeros(shape), trainable=Trainable)
        self.gamma = tf.Variable(initial_value=tf.ones(shape), trainable=Trainable)
        # Running statistics, intentionally non-trainable
        self.moving_mean = tf.Variable(initial_value=tf.zeros(self.shape), trainable=False)
        self.moving_var = tf.Variable(initial_value=tf.ones(self.shape), trainable=False)

    def update_var(self, inputs):
        # Batch statistics over the N, H, W axes
        wu, sigma = tf.nn.moments(inputs, axes=[0, 1, 2], shift=None, keepdims=False, name=None)
        var = tf.math.sqrt(sigma)
        # Intended as an exponential moving average of the running statistics
        self.moving_mean = self.moving_mean * 0.09 + wu * 0.01
        self.moving_var = self.moving_var * 0.09 + var * 0.01
        return wu, var

    def call(self, inputs):
        wu, var = self.update_var(inputs)
        return tf.nn.batch_normalization(inputs, wu, var, self.beta,
                                         self.gamma, variance_epsilon=0.001)


@tf.function
def train_step(model, inputs, label, optimizer):
    with tf.GradientTape(persistent=False) as tape:
        predictions = model(inputs, training=1)
        loss = tf.keras.losses.mean_squared_error(predictions, label)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))


if __name__ == '__main__':
    f = tf.ones([2, 256, 256, 8])
    label = tf.ones([2, 256, 256, 8])
    inputs = tf.keras.Input(shape=(256, 256, 8))
    outputs = batchNormalization([8], True)(inputs)
    Model = tf.keras.Model(inputs=inputs, outputs=outputs)
    Layer = batchNormalization([8], True)
    print(len(Model.variables))
    print(len(Model.trainable_variables))
    print(len(Layer.variables))
    print(len(Layer.trainable_variables))
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    for i in range(0, 100):
        train_step(Layer, f, label, optimizer)
        # train_step(Model, f, label, optimizer)
When training, another error appears:

TypeError: An op outside of the function building code is being passed a "Graph" tensor. It is possible to have Graph tensors leak out of the function building context by including a tf.init_scope in your function building code.

Replacing

self.moving_mean = self.moving_mean * 0.09 + wu * 0.01
self.moving_var = self.moving_var * 0.09 + var * 0.01

with

self.moving_mean.assign(self.moving_mean * 0.09 + wu * 0.01)
self.moving_var.assign(self.moving_var * 0.09 + var * 0.01)

solves the problem. Plain Python assignment does not update the variable: it rebinds the attribute to the result tensor, so after the first call self.moving_mean is no longer a tf.Variable but a graph tensor, which then leaks out of the tf.function and triggers the TypeError above. .assign() mutates the existing variable in place, so it remains a tracked, non-trainable variable of the layer.
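
For reference, here is a minimal sketch of the corrected layer, not the exact code above. One further caveat: tf.nn.moments already returns the variance, so the tf.math.sqrt in the original computes the standard deviation; the sketch passes the variance straight through to tf.nn.batch_normalization, which expects a variance.

import tensorflow as tf

class BatchNormalization(tf.keras.layers.Layer):
    def __init__(self, shape, trainable, **kwargs):
        super().__init__(**kwargs)
        # Learnable offset and scale
        self.beta = tf.Variable(tf.zeros(shape), trainable=trainable)
        self.gamma = tf.Variable(tf.ones(shape), trainable=trainable)
        # Non-trainable running statistics, updated in place with .assign()
        self.moving_mean = tf.Variable(tf.zeros(shape), trainable=False)
        self.moving_var = tf.Variable(tf.ones(shape), trainable=False)

    def call(self, inputs):
        # Batch statistics over the N, H, W axes
        mean, variance = tf.nn.moments(inputs, axes=[0, 1, 2])
        # .assign() mutates the variables instead of rebinding the attributes
        self.moving_mean.assign(self.moving_mean * 0.09 + mean * 0.01)
        self.moving_var.assign(self.moving_var * 0.09 + variance * 0.01)
        return tf.nn.batch_normalization(inputs, mean, variance, self.beta,
                                         self.gamma, variance_epsilon=0.001)

layer = BatchNormalization([8], trainable=True)
_ = layer(tf.ones([2, 256, 256, 8]))
print(len(layer.variables))            # 4
print(len(layer.trainable_variables))  # 2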

Why make moving_mean and moving_var non-trainable tf.Variables rather than tf.constants? For that matter, why not simply use plain Python variables?

Because I want to track self.moving_mean and self.moving_var and change their values from outside the layer.
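
That tracking is exactly what a non-trainable tf.Variable provides: the same object persists across calls, appears in layer.variables (and therefore in checkpoints), and can be read or overwritten from outside the layer, which a tf.constant or a plain Python attribute inside a tf.function cannot do reliably. A short sketch, reusing the layer built above:

# Read the running statistics from outside the layer
print(layer.moving_mean.numpy())

# Reset them from outside, e.g. before evaluation
layer.moving_mean.assign(tf.zeros([8]))
layer.moving_var.assign(tf.ones([8]))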