
Is TensorRT "floating-point 16" precision mode non-deterministic on Jetson TX2?


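I am using TensorRT's FP16 precision mode to optimize my deep learning model, and I run the optimized model on a Jetson TX2. While testing it, I observed that the TensorRT inference engine is not deterministic. In other words, my optimized model reports anywhere between 40 and 120 FPS for the same input images.
I started to think that floating-point operations were the source of the non-determinism when I saw this comment about CUDA:
"If your code uses floating-point atomics, results may differ from run to run because floating-point operations are generally not associative, and the order in which data enters a computation (e.g. a sum) is non-deterministic when atomics are used."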
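The non-associativity that the quoted comment refers to can be seen in a small sketch. The function names and the input values mentioned afterwards are hypothetical and chosen only for illustration, and IEEE 754 single-precision `float` is assumed. The sketch concerns the numerical value of a result, which is a separate matter from how long a computation takes.

```cpp
// Hypothetical helpers (not from the original post): the same three values
// summed with two different groupings.

// Computes (a + b) + c.
float sum_left_first(float a, float b, float c)
{
    // volatile forces the intermediate to be rounded to float precision
    volatile float partial = a + b;
    return partial + c;
}

// Computes a + (b + c).
float sum_right_first(float a, float b, float c)
{
    volatile float partial = b + c;
    return a + partial;
}
```

With a = 1e8f, b = -1e8f and c = 1.0f, the first grouping yields 1.0f, while the second yields 0.0f, because -1e8f + 1.0f rounds back to -1e8f in single precision. When the order of accumulation is not fixed, as with atomics, the same inputs can therefore produce slightly different sums.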
Does the precision type (FP16, FP32 or INT8) affect the determinism of TensorRT? Or is it something else?
Do you have any thoughts?

Best regards.
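I solved the problem by replacing the clock() function that I had been using to measure latency. clock() measures CPU time, but what I wanted to measure was real (wall-clock) time. I now use std::chrono to measure latency, and the measured inference latency is now deterministic.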
This was the wrong approach, using clock():

    #include <stdio.h>
    #include <time.h>

    int main()
    {
        clock_t t = clock();
        inferenceEngine(); // Do the inference (the poster's own function)
        t = clock() - t;   // CPU time used by the process, not wall-clock time
        printf("It took me %ld clicks (%f seconds).\n",
               (long)t, ((float)t) / CLOCKS_PER_SEC);
        return 0;
    }

Use CUDA events like this:

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start);
    inferenceEngine(); // Do the inference
    cudaEventRecord(stop);

    cudaEventSynchronize(stop); // Wait until the stop event has completed
    float milliseconds = 0;
    cudaEventElapsedTime(&milliseconds, start, stop); // Elapsed time in ms

Use chrono like this:

    #include <iostream>
    #include <chrono>
    #include <ctime>

    int main()
    {
        auto start = std::chrono::system_clock::now();
        inferenceEngine(); // Do the inference
        auto end = std::chrono::system_clock::now();

        std::chrono::duration<double> elapsed_seconds = end - start;
        std::time_t end_time = std::chrono::system_clock::to_time_t(end);

        std::cout << "finished computation at " << std::ctime(&end_time)
                  << "elapsed time: " << elapsed_seconds.count() << "s\n";
    }
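The difference between CPU time and wall-clock time described in this answer can be reproduced without a GPU. The following is a minimal sketch, not the poster's code: `Timing` and `time_blocking_call` are hypothetical names, a sleep stands in for a blocking inference call, and a POSIX-like system is assumed, where `std::clock()` reports processor time used by the process. It uses `std::chrono::steady_clock`, which is generally preferred over `system_clock` for measuring intervals because it is not affected by adjustments to the system time.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical result type: both measurements of the same interval.
struct Timing {
    double cpu_seconds;   // processor time reported by std::clock()
    double wall_seconds;  // elapsed real time reported by steady_clock
};

// Times a blocking wait with both clocks. The sleep is only a stand-in
// for a call that spends most of its time waiting rather than computing.
Timing time_blocking_call(std::chrono::milliseconds wait)
{
    const std::clock_t cpu_start = std::clock();
    const auto wall_start = std::chrono::steady_clock::now();

    std::this_thread::sleep_for(wait);

    const std::clock_t cpu_end = std::clock();
    const auto wall_end = std::chrono::steady_clock::now();

    Timing t;
    t.cpu_seconds = static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
    t.wall_seconds =
        std::chrono::duration<double>(wall_end - wall_start).count();
    return t;
}
```

Calling `time_blocking_call(std::chrono::milliseconds(200))` under these assumptions should report a wall time of at least roughly 0.2 s and a much smaller CPU time, since a sleeping thread consumes almost no processor time. A frame rate derived from the CPU figure would therefore depend on how the host thread happens to wait, which may explain the wide spread of FPS values reported in the question.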