Tensorflow 2指标使用2个GPU产生错误结果

Tensorflow 2指标使用2个GPU产生错误结果,tensorflow,metrics,auc,multi-gpu,Tensorflow,Metrics,Auc,Multi Gpu,我从tensorflow文档中获取了这段关于使用自定义循环进行分布式培训的代码,并将其修复为使用tf.keras.metrics.AUC并使用2个GPU(DGX机器上的2个Nvidia V100)运行 问题是AUC的评估肯定是错误的,因为它超出了它的范围(应该是0-100),我通过运行上述代码一次得到了这些结果: Epoch 1, Loss: 1.8061423301696777, Accuracy: 66.00833892822266, AUC: 321.8688659667969,Test

我从tensorflow文档中获取了这段关于使用自定义循环进行分布式培训的代码,并将其修复为使用tf.keras.metrics.AUC并使用2个GPU(DGX机器上的2个Nvidia V100)运行

问题是AUC的评估肯定是错误的,因为它超出了它的范围(应该是0-100),我通过运行上述代码一次得到了这些结果:

Epoch 1, Loss: 1.8061423301696777, Accuracy: 66.00833892822266, AUC: 321.8688659667969,Test Loss: 1.742477536201477, Test Accuracy: 72.0999984741211, Test AUC: 331.33709716796875
Epoch 2, Loss: 1.7129968404769897, Accuracy: 74.9816665649414, AUC: 337.37017822265625,Test Loss: 1.7084736824035645, Test Accuracy: 75.52999877929688, Test AUC: 337.1878967285156
Epoch 3, Loss: 1.643971562385559, Accuracy: 81.83333587646484, AUC: 355.96209716796875,Test Loss: 1.6072628498077393, Test Accuracy: 85.3499984741211, Test AUC: 370.603759765625
Epoch 4, Loss: 1.5887378454208374, Accuracy: 87.27833557128906, AUC: 373.6204528808594,Test Loss: 1.5906082391738892, Test Accuracy: 87.13999938964844, Test AUC: 371.9998474121094
Epoch 5, Loss: 1.581775426864624, Accuracy: 88.0, AUC: 373.9468994140625,Test Loss: 1.5964380502700806, Test Accuracy: 86.68000030517578, Test AUC: 371.0227355957031
Epoch 6, Loss: 1.5764907598495483, Accuracy: 88.49166870117188, AUC: 375.2404479980469,Test Loss: 1.5832056999206543, Test Accuracy: 87.94000244140625, Test AUC: 373.41998291015625
Epoch 7, Loss: 1.5698528289794922, Accuracy: 89.19166564941406, AUC: 376.473876953125,Test Loss: 1.5770654678344727, Test Accuracy: 88.58000183105469, Test AUC: 375.5516662597656
Epoch 8, Loss: 1.564456820487976, Accuracy: 89.71833801269531, AUC: 377.8564758300781,Test Loss: 1.5792100429534912, Test Accuracy: 88.27000427246094, Test AUC: 373.1791687011719
Epoch 9, Loss: 1.5612279176712036, Accuracy: 90.02000427246094, AUC: 377.9949645996094,Test Loss: 1.5729509592056274, Test Accuracy: 88.9800033569336, Test AUC: 375.5257263183594
Epoch 10, Loss: 1.5562015771865845, Accuracy: 90.54000091552734, AUC: 378.9789123535156,Test Loss: 1.56815767288208, Test Accuracy: 89.3499984741211, Test AUC: 375.8636474609375
准确度是可以的,但它似乎是唯一一个表现良好的指标。我也尝试了其他指标,但没有正确评估。问题似乎出现在使用多个GPU时,因为当我使用一个GPU运行此代码时,它会产生正确的结果

Epoch 1, Loss: 1.8061423301696777, Accuracy: 66.00833892822266, AUC: 321.8688659667969,Test Loss: 1.742477536201477, Test Accuracy: 72.0999984741211, Test AUC: 331.33709716796875
Epoch 2, Loss: 1.7129968404769897, Accuracy: 74.9816665649414, AUC: 337.37017822265625,Test Loss: 1.7084736824035645, Test Accuracy: 75.52999877929688, Test AUC: 337.1878967285156
Epoch 3, Loss: 1.643971562385559, Accuracy: 81.83333587646484, AUC: 355.96209716796875,Test Loss: 1.6072628498077393, Test Accuracy: 85.3499984741211, Test AUC: 370.603759765625
Epoch 4, Loss: 1.5887378454208374, Accuracy: 87.27833557128906, AUC: 373.6204528808594,Test Loss: 1.5906082391738892, Test Accuracy: 87.13999938964844, Test AUC: 371.9998474121094
Epoch 5, Loss: 1.581775426864624, Accuracy: 88.0, AUC: 373.9468994140625,Test Loss: 1.5964380502700806, Test Accuracy: 86.68000030517578, Test AUC: 371.0227355957031
Epoch 6, Loss: 1.5764907598495483, Accuracy: 88.49166870117188, AUC: 375.2404479980469,Test Loss: 1.5832056999206543, Test Accuracy: 87.94000244140625, Test AUC: 373.41998291015625
Epoch 7, Loss: 1.5698528289794922, Accuracy: 89.19166564941406, AUC: 376.473876953125,Test Loss: 1.5770654678344727, Test Accuracy: 88.58000183105469, Test AUC: 375.5516662597656
Epoch 8, Loss: 1.564456820487976, Accuracy: 89.71833801269531, AUC: 377.8564758300781,Test Loss: 1.5792100429534912, Test Accuracy: 88.27000427246094, Test AUC: 373.1791687011719
Epoch 9, Loss: 1.5612279176712036, Accuracy: 90.02000427246094, AUC: 377.9949645996094,Test Loss: 1.5729509592056274, Test Accuracy: 88.9800033569336, Test AUC: 375.5257263183594
Epoch 10, Loss: 1.5562015771865845, Accuracy: 90.54000091552734, AUC: 378.9789123535156,Test Loss: 1.56815767288208, Test Accuracy: 89.3499984741211, Test AUC: 375.8636474609375