Python TensorFlow error: Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED


My PC specs are: Windows 10, CUDA 11.2, cuDNN 8.0.5, NVIDIA GeForce GTX 3080

I followed this website () to install Faster R-CNN. When I train the network, it fails with this error:

2021-01-24 18:12:47.713443: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2021-01-24 18:12:47.715010: E tensorflow/stream_executor/cuda/cuda_dnn.cc:340] Error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
2021-01-24 18:12:47.718097: E tensorflow/stream_executor/cuda/cuda_dnn.cc:336] Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2021-01-24 18:12:47.719553: E tensorflow/stream_executor/cuda/cuda_dnn.cc:340] Error retrieving driver version: Unimplemented: kernel reported driver version not implemented on Windows
Traceback (most recent call last):
  File "model_main_tf2.py", line 113, in <module>
    tf.compat.v1.app.run()
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\absl\app.py", line 300, in run
    _run_main(main, args)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\absl\app.py", line 251, in _run_main
    sys.exit(main(argv))
  File "model_main_tf2.py", line 104, in main
    model_lib_v2.train_loop(
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\object_detection\model_lib_v2.py", line 561, in train_loop
    load_fine_tune_checkpoint(detection_model,
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\object_detection\model_lib_v2.py", line 361, in load_fine_tune_checkpoint
    strategy.run(
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\distribute\distribute_lib.py", line 1259, in run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\distribute\distribute_lib.py", line 2730, in call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\distribute\mirrored_strategy.py", line 628, in _call_for_each_replica
    return mirrored_run.call_for_each_replica(
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\distribute\mirrored_run.py", line 75, in call_for_each_replica
    return wrapped(args, kwargs)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\eager\def_function.py", line 828, in __call__
    result = self._call(*args, **kwds)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\eager\def_function.py", line 888, in _call
    return self._stateless_fn(*args, **kwds)
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\eager\function.py", line 2942, in __call__
    return graph_function._call_flat(
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\eager\function.py", line 1918, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\eager\function.py", line 555, in call
    outputs = execute.execute(
  File "C:\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\eager\execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[node model/conv1_conv/Conv2D (defined at \site-packages\object_detection\meta_architectures\faster_rcnn_meta_arch.py:1346) ]]
         [[Loss/RPNLoss/BalancedPositiveNegativeSampler/Cast_8/_192]]
  (1) Unknown:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
         [[node model/conv1_conv/Conv2D (defined at \site-packages\object_detection\meta_architectures\faster_rcnn_meta_arch.py:1346) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference__dummy_computation_fn_16411]

Errors may have originated from an input operation.
Input Source operations connected to node model/conv1_conv/Conv2D:
 model/lambda/Pad (defined at \site-packages\object_detection\models\keras_models\resnet_v1.py:49)

Input Source operations connected to node model/conv1_conv/Conv2D:
 model/lambda/Pad (defined at \site-packages\object_detection\models\keras_models\resnet_v1.py:49)

Function call stack:
_dummy_computation_fn -> _dummy_computation_fn

How can I solve this problem?

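Can you share your TensorFlow version?

I believe no TensorFlow release supports CUDA 11.2. If you take the time to read the release notes for the version you are using, they clearly state which versions are supported.

I installed tensorflow-gpu 2.4.1. How can I solve it?

I looked at the repo you are trying to use, and the author also states that the code was tested with TensorFlow 2.3, which means they also used CUDA 10.1. Your option is to downgrade the CUDA and TensorFlow libraries.

I trained Faster R-CNN with CUDA 10.1, cuDNN 8.0.4, and tensorflow-gpu 2.3.0, but the loss is NaN. How can I solve this?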
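To make the version advice above concrete, here is a minimal sketch that compares an installed CUDA/cuDNN pair against the versions a TensorFlow release was tested with. The `TESTED_BUILDS` table and the helper names `major_minor` and `check_versions` are hypothetical, written for illustration. The two table rows were copied by hand from the tested build configurations in TensorFlow's install guide, so check them against the official table before relying on them.

```python
# Hand-copied subset of TensorFlow's "tested build configurations" table.
# Assumption: these rows match the official install guide; verify before use.
TESTED_BUILDS = {
    "2.3": {"cuda": "10.1", "cudnn": "7.6"},
    "2.4": {"cuda": "11.0", "cudnn": "8.0"},
}


def major_minor(version):
    """Reduce a version string such as '8.0.5' to 'major.minor' ('8.0')."""
    return ".".join(version.split(".")[:2])


def check_versions(tf_version, cuda_version, cudnn_version):
    """Return a list of mismatch messages; an empty list means no mismatch."""
    key = major_minor(tf_version)
    expected = TESTED_BUILDS.get(key)
    if expected is None:
        return [f"TensorFlow {key} is not in the local table"]

    problems = []
    if major_minor(cuda_version) != expected["cuda"]:
        problems.append(
            f"CUDA {cuda_version} installed, but TensorFlow {key} "
            f"was tested with CUDA {expected['cuda']}"
        )
    if major_minor(cudnn_version) != expected["cudnn"]:
        problems.append(
            f"cuDNN {cudnn_version} installed, but TensorFlow {key} "
            f"was tested with cuDNN {expected['cudnn']}"
        )
    return problems


if __name__ == "__main__":
    # The setup from the question: tensorflow-gpu 2.4.1, CUDA 11.2, cuDNN 8.0.5.
    for message in check_versions("2.4.1", "11.2", "8.0.5"):
        print(message)
```

With the values from the question, the sketch flags only the CUDA version. With the later combination (TensorFlow 2.3.0, CUDA 10.1, cuDNN 8.0.4), it flags only cuDNN. In TensorFlow 2.3 and later, `tf.sysconfig.get_build_info()` reports the CUDA and cuDNN versions the installed wheel was built against, which may be a more reliable source than a hand-maintained table.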