Python TypeError: using a custom activation function with mixed precision enabled?

python tensorflow keras


I'm trying to use a custom activation in a training pipeline with mixed precision enabled, but I get the following error:

TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float16 of argument 'x'.

To reproduce

Enable mixed precision

    import tensorflow as tf

    policy = tf.keras.mixed_precision.experimental.Policy('mixed_float16')
    tf.keras.mixed_precision.experimental.set_policy(policy)
    print('Mixed precision enabled')
Custom activation

    def ARelu(x, alpha=0.90, beta=2.0):
        alpha = tf.clip_by_value(alpha, clip_value_min=0.01, clip_value_max=0.99)
        beta = 1 + tf.math.sigmoid(beta)
        return tf.nn.relu(x) * beta - tf.nn.relu(-x) * alpha
Training

    import tensorflow as tf

    (xtrain, ytrain), (xtest, ytest) = tf.keras.datasets.mnist.load_data()

    def pre_process(inputs, targets):
        inputs = tf.expand_dims(inputs, -1)
        targets = tf.one_hot(targets, depth=10)
        return tf.divide(inputs, 255), targets

    train_data = tf.data.Dataset.from_tensor_slices((xtrain, ytrain)).\
        take(10_000).shuffle(10_000).batch(8).map(pre_process)

    test_data = tf.data.Dataset.from_tensor_slices((xtest, ytest)).\
        take(1_000).shuffle(1_000).batch(8).map(pre_process)

    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(filters=16, kernel_size=(3, 3), strides=(1, 1),
                               input_shape=(28, 28, 1), activation=ARelu),
        tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
        tf.keras.layers.Conv2D(filters=32, kernel_size=(3, 3), strides=(1, 1),
                               activation=ARelu),
        tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation=ARelu),
        # Keep the output layer in float32 for numerical stability
        tf.keras.layers.Dense(10, activation='softmax', dtype=tf.float32)])

    opt = tf.keras.optimizers.Adam()
    model.compile(loss='categorical_crossentropy', optimizer=opt)
    history = model.fit(train_data, validation_data=test_data, epochs=10)

    # ------------------
    # TypeError: Input 'y' of 'Mul' Op has type float32 that does not match type float16 of argument 'x'.
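Without mixed precision, however, it works. I understand what the error means, but where should I look to fix it?

Also, while trying to solve this, I found that tf.keras.mixed_precision.LossScaleOptimizer can be used to safely avoid numeric underflow. Is this something we should use for mixed precision training?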
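The underflow that loss scaling guards against can be illustrated without TensorFlow. The following is only a minimal sketch using the standard library's half-precision `struct` format; the helper `to_float16`, the constant `LOSS_SCALE`, and the sample gradient value are hypothetical names and numbers chosen for illustration, and the gradient is scaled directly here to stand in for scaling the loss before backpropagation.

```python
import struct


def to_float16(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack('e', struct.pack('e', value))[0]


# Hypothetical fixed scale, chosen only for illustration.
LOSS_SCALE = 2.0 ** 15

tiny_grad = 1e-8  # smaller than anything float16 can represent

# Stored directly in float16, the gradient is flushed to zero.
unscaled = to_float16(tiny_grad)

# Scaled first, the value lands in float16's representable range.
scaled = to_float16(tiny_grad * LOSS_SCALE)

# Dividing by the scale in higher precision recovers the gradient.
recovered = scaled / LOSS_SCALE

print(unscaled, scaled, recovered)
```

As far as I know, `Model.fit` applies loss scaling automatically when the `mixed_float16` policy is active, so wrapping the optimizer explicitly mainly matters for custom training loops.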
To get around this, I had to cast the input to float32. I'm not sure whether this is the right way to fix the error.

    def ARelu(x, alpha=0.90, beta=2.0):
        alpha = tf.clip_by_value(alpha, clip_value_min=0.01, clip_value_max=0.99)
        beta = 1 + tf.math.sigmoid(beta)
        # Cast the float16 input up so it matches the float32 alpha and beta
        x = tf.cast(x, 'float32')
        return tf.nn.relu(x) * beta - tf.nn.relu(-x) * alpha
Just cast the input to float32 and it works.
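Casting x to float32 removes the error, but it also means the activation itself runs in float32. The mismatch comes from alpha and beta, which start as Python floats and become float32 tensors, so another option is to cast them to the dtype of x instead. The sketch below assumes this is acceptable for your use case; `arelu_keep_dtype` is a hypothetical name, not part of the original post or of TensorFlow.

```python
import tensorflow as tf


def arelu_keep_dtype(x, alpha=0.90, beta=2.0):
    """Variant of ARelu that follows the dtype of its input (hypothetical helper)."""
    # Python floats are converted to float32 tensors by these ops.
    alpha = tf.clip_by_value(alpha, clip_value_min=0.01, clip_value_max=0.99)
    beta = 1 + tf.math.sigmoid(beta)
    # Cast the constants down to the input dtype (float16 under mixed_float16)
    # instead of casting the input up to float32.
    alpha = tf.cast(alpha, x.dtype)
    beta = tf.cast(beta, x.dtype)
    return tf.nn.relu(x) * beta - tf.nn.relu(-x) * alpha


# Usage: a float16 input stays float16 and no dtype mismatch is raised.
y = arelu_keep_dtype(tf.constant([-1.0, 2.0], dtype=tf.float16))
print(y.dtype, y.numpy())
```

The trade-off is precision: alpha and beta are rounded to float16, which may or may not matter for a given model.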
Details

However, to actually take advantage of mixed precision, we have to do the following:

    # At the beginning ....
    policy = tf.keras.mixed_precision.experimental.Policy('mixed_float16')
    tf.keras.mixed_precision.experimental.set_policy(policy)
    print('Mixed precision enabled')
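and

    # At the last layer, set dtype to float32
    tf.keras.layers.Dense(num_classes, activation=..., dtype=tf.float32)])

Honestly, I still don't know how the mixed precision mechanism works behind the scenes. First, it sets the policy mixed_float16, and the output activation is cast to tf.float32. As a result, we can't use a custom activation function unless we cast the input x to float32, which I think is simply how mixed precision with float16 behaves.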
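What the mixed_float16 policy does to a layer can be inspected directly: computations run in float16 while the variables stay in float32. The sketch below assumes TensorFlow 2.4 or later, where the non-experimental `tf.keras.mixed_precision.Policy` is available, and passes the policy to a single layer rather than setting it globally so that nothing else is affected.

```python
import tensorflow as tf

# Assumes TF >= 2.4 (non-experimental mixed precision API).
policy = tf.keras.mixed_precision.Policy('mixed_float16')

compute_dtype = policy.compute_dtype    # dtype used for the layer's computations
variable_dtype = policy.variable_dtype  # dtype used to store the weights

# Apply the policy to one layer only, leaving the global policy untouched.
layer = tf.keras.layers.Dense(4, dtype=policy)
out = layer(tf.zeros((1, 3)))  # float32 input is cast to float16 by the layer

kernel_dtype = tf.as_dtype(layer.kernel.dtype)

print(compute_dtype, variable_dtype, out.dtype, kernel_dtype)
```

This is why a custom activation receives a float16 tensor inside such a layer, and why any float32 constant it multiplies by triggers the dtype mismatch.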