TensorFlow automatic mixed precision fp16 is slower than fp32 on the official ResNet


I am trying to test the AMP support included in tensorflow-gpu==1.14.0rc0 by benchmarking the official ResNet model. I am running on a 2080 Ti with driver 410.78 and CUDA 10, on Ubuntu.

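I made the following changes to help ensure a quick and accurate comparison:

  • Reduced the number of epochs to 10
  • Removed the adjusted, 2x larger batch size, so that both runs train on the same number of samples
  • Set checkpointing to happen only once, at the end of training
  • Switched to training on CIFAR-10, since I already had it downloaded on local disk

I see the following in the logs, which to me means that AMP is active: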

2019-06-03 16:08:40.976829: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-06-03 16:08:40.977057: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.985402: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.986858: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.987745: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:40.996781: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.001948: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.003208: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.004589: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.005981: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1241] No whitelist ops found, nothing to do
2019-06-03 16:08:41.511761: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1767] Running auto_mixed_precision graph optimizer
2019-06-03 16:08:41.527751: I tensorflow/core/grappler/optimizers/auto_mixed_precision.cc:1723] Converted 529/2910 nodes to float16 precision using 3 cast(s) to float16 (excluding Const and Variable casts)
But the actual run time is slower:

What can I do to improve performance?
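
For reference, the share of the graph that was actually converted can be read off the log summary line quoted above. The following is only a minimal sketch: the function name `amp_conversion_stats` is hypothetical, and the regular expression assumes the summary line keeps the exact wording shown in the log above.

```python
import re

# Matches the summary line printed by auto_mixed_precision.cc in the log above.
# Assumption: the wording of that line is exactly as quoted in this post.
CONVERTED_RE = re.compile(
    r"Converted (\d+)/(\d+) nodes to float16 precision using (\d+) cast\(s\)"
)


def amp_conversion_stats(log_text):
    """Return a list of (converted, total, casts, fraction) tuples,
    one per summary line found in log_text."""
    stats = []
    for match in CONVERTED_RE.finditer(log_text):
        converted, total, casts = (int(group) for group in match.groups())
        fraction = converted / total if total else 0.0
        stats.append((converted, total, casts, fraction))
    return stats


if __name__ == "__main__":
    sample = (
        "Converted 529/2910 nodes to float16 precision using 3 cast(s) "
        "to float16 (excluding Const and Variable casts)"
    )
    for converted, total, casts, fraction in amp_conversion_stats(sample):
        print(f"{converted}/{total} nodes converted ({fraction:.1%}), {casts} casts")
```

On the log above this reports 529 of 2910 nodes, roughly 18% of the graph, converted to float16. The "No whitelist ops found" lines produce no entry, since no conversion happened in those passes.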