Training with fp16/half precision on RTX cards using Keras/TensorFlow


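I just bought an RTX 2070 Super, and I'd like to try half-precision training using Keras with the TensorFlow backend.

So far, I have found articles that suggest the following settings: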

import keras.backend as K

dtype='float16'
K.set_floatx(dtype)

# default is 1e-7 which is too small for float16.  Without adjusting the epsilon, we will get NaN predictions because of divide by zero problems
K.set_epsilon(1e-4)
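One way to see why the default epsilon is too small is the NumPy sketch below. It is only an illustration: NumPy stands in for the backend (an assumption, Keras is not involved), and `guarded_reciprocal` is a hypothetical helper. When a denominator is protected only by epsilon, dividing by 1e-7 produces a value far above the largest float16 number (65504), so the result overflows to infinity, which can later turn into NaN. With 1e-4 the quotient stays finite.

```python
import numpy as np

def guarded_reciprocal(denominator, epsilon):
    """Compute 1 / (denominator + epsilon) entirely in float16.

    Hypothetical helper for illustration; not part of Keras.
    """
    denom16 = np.float16(denominator) + np.float16(epsilon)
    # Silence the overflow warning so the result can be inspected directly.
    with np.errstate(over="ignore", divide="ignore"):
        return np.float16(1.0) / denom16

default_eps = guarded_reciprocal(0.0, 1e-7)  # Keras default epsilon
larger_eps = guarded_reciprocal(0.0, 1e-4)   # epsilon suggested above

print("epsilon=1e-7:", default_eps)
print("epsilon=1e-4:", larger_eps)
```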

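The network is a simple 4-layer CNN for audio classification.

My input data is a previously generated NumPy 3D array (audio MFCC features extracted with LibROSA). The data was generated on the CPU, and I understand the values are stored as 32-bit floats.

When I try to train my network with this data, I get the following error:

TypeError: Tensors in list passed to 'inputs' of 'Merge' Op have types [float16, float32] that don't all match.

In another post I read that I should also "cast back to FP32 before the SoftMax layer", which makes things even more confusing.

I would really appreciate some guidance.

Thanks.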

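Without knowing the model architecture, it is hard to say what causes the dtype mismatch. However, I suspect the model has a BatchNorm layer before the Merge.

In that case, the Merge error and the softmax advice have the same root cause: operations that compute statistics (mean/variance) are better done in float32. With float16, the rounding error can become too large and give inaccurate results, especially during division.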
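The precision problem can be illustrated with a small NumPy sketch (an illustration only; `running_sum` is a hypothetical helper, and real backends may accumulate differently). float16 has an 11-bit significand, so once a running total reaches 2048, adding 1.0 no longer changes it. Any mean or variance built on such a sum would be wrong, while the same accumulation in float32 is exact here.

```python
import numpy as np

def running_sum(values, dtype):
    """Accumulate values one by one, rounding to `dtype` after every step."""
    total = dtype(0)
    for v in values:
        total = dtype(total + dtype(v))
    return total

ones = np.ones(4096)

sum16 = running_sum(ones, np.float16)  # stalls at 2048.0
sum32 = running_sum(ones, np.float32)  # reaches 4096.0

print("float16 sum:", sum16, "-> mean:", sum16 / len(ones))
print("float32 sum:", sum32, "-> mean:", sum32 / len(ones))
```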

I haven't tried it, but in the BatchNormalization layer of Keras (at least 2.2.5), the variance is converted to float32 when TensorFlow is the backend:

if K.backend() != 'cntk':
    sample_size = K.prod([K.shape(inputs)[axis]
                          for axis in reduction_axes])
    sample_size = K.cast(sample_size, dtype=K.dtype(inputs))
    if K.backend() == 'tensorflow' and sample_size.dtype != 'float32':
        sample_size = K.cast(sample_size, dtype='float32')

    # sample variance - unbiased estimator of population variance
    variance *= sample_size / (sample_size - (1.0 + self.epsilon))

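It may be that the tensor produced by the normalization is not cast back to float16, which causes the error. To address this, you can first remove the BatchNorm layer to confirm it is the culprit, and then either modify your local copy of Keras or implement a custom BatchNorm that casts the result back to 'float16' after normalization.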
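The idea of "compute the statistics in float32, then cast back" can be sketched in plain NumPy. This is not a Keras layer and has not been tested against Keras 2.2.5; `batch_norm_fp16`, its parameters, and the default `epsilon` are assumptions made for illustration. A custom Keras layer would apply the same pattern with backend casts.

```python
import numpy as np

def batch_norm_fp16(x, gamma, beta, epsilon=1e-3):
    """Batch-normalize a float16 array over axis 0.

    Statistics are computed in float32 for accuracy, and the result is
    cast back to the input dtype so downstream ops see matching types.
    """
    x32 = x.astype(np.float32)
    mean = x32.mean(axis=0)
    var = x32.var(axis=0)
    normalized = (x32 - mean) / np.sqrt(var + epsilon)
    out = gamma.astype(np.float32) * normalized + beta.astype(np.float32)
    return out.astype(x.dtype)  # back to float16

# Toy batch: 64 samples with 8 features each, stored as float16.
rng = np.random.RandomState(0)
x = rng.randn(64, 8).astype(np.float16)
gamma = np.ones(8, dtype=np.float16)
beta = np.zeros(8, dtype=np.float16)

out = batch_norm_fp16(x, gamma, beta)
print(out.dtype, out.shape)
```

Because the output has the same dtype as the input, a later merge of this tensor with other float16 tensors would no longer mix float16 and float32.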