Tensorflow 2.0 can't use GPU, is something wrong with cuDNN? Failed to get convolution algorithm. This is probably because cuDNN failed to initialize

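I am trying to understand and debug my code. I am trying to run prediction on a GPU with a CNN model developed under tf2.0/tf.keras, but I get the error messages below. Can someone help me fix this?

Here is my environment configuration:

environment:
python 3.6.8
tensorflow-gpu 2.0.0-rc0
nvidia 418.x
CUDA 10.0
cuDNN 7.6+
and the log file: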

2019-09-28 13:10:59.833892: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-09-28 13:11:00.228025: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-09-28 13:11:00.957534: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-28 13:11:00.963310: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-09-28 13:11:00.963416: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node mobilenetv2_1.00_192/Conv1/Conv2D}}]]
mobilenetv2_1.00_192/block_15_expand_BN/cond/then/_630/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
=====>GPU Available:  True
=====> 4 Physical GPUs, 1 Logical GPUs

mobilenetv2_1.00_192/block_15_expand_BN/cond/then/_630/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_15_depthwise_BN/cond/then/_644/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_15_depthwise_BN/cond/then/_644/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_15_project_BN/cond/then/_658/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_15_project_BN/cond/then/_658/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_16_expand_BN/cond/then/_672/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_16_expand_BN/cond/then/_672/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_16_depthwise_BN/cond/then/_686/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_16_depthwise_BN/cond/then/_686/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_16_project_BN/cond/then/_700/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/block_16_project_BN/cond/then/_700/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/Conv_1_bn/cond/then/_714/Const: (Const): /job:localhost/replica:0/task:0/device:GPU:0
mobilenetv2_1.00_192/Conv_1_bn/cond/then/_714/Const_1: (Const): /job:localhost/replica:0/task:0/device:GPU:0
Traceback (most recent call last):
  File "NSFW_Server.py", line 162, in <module>
    model.predict(initial_tensor)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 915, in predict
    use_multiprocessing=use_multiprocessing)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 722, in predict
    callbacks=callbacks)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 393, in model_iteration
    batch_outs = f(ins_batch)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/backend.py", line 3625, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1121, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 511, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node mobilenetv2_1.00_192/Conv1/Conv2D (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py:1751) ]] [Op:__inference_keras_scratch_graph_10727]

Function call stack:
keras_scratch_graph
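
The error text says to look for a warning printed earlier in the log. As a minimal sketch, the filter below pulls only the warning, error, and fatal lines out of a log like the one above. The regular expression is an assumption based on the line layout visible in this log (`date time: LEVEL file:line] message`), and `find_problems` is a hypothetical helper name, not part of TensorFlow.

```python
import re

# Assumed layout, taken from the log lines in this question:
#   "<date> <time>: <LEVEL> <source file>:<line>] <message>"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:.]+): (?P<level>[IWEF]) "
    r"(?P<source>\S+):(?P<line>\d+)\] (?P<message>.*)$"
)


def find_problems(log_text, levels=("W", "E", "F")):
    """Return (level, source file, message) for each log line at the given levels."""
    hits = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") in levels:
            hits.append(
                (match.group("level"), match.group("source"), match.group("message"))
            )
    return hits


if __name__ == "__main__":
    sample = (
        "2019-09-28 13:11:00.228025: I tensorflow/stream_executor/platform/default/"
        "dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7\n"
        "2019-09-28 13:11:00.957534: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] "
        "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR\n"
    )
    for level, source, message in find_problems(sample):
        print(level, source, message)
```

For this log, the lines that survive the filter are the two `cuda_dnn.cc` errors and the `base_collective_executor.cc` warning, so the cuDNN handle failure is the message to investigate first.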

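You have to check that your versions of CUDA, cuDNN, and TensorFlow are compatible with each other (and make sure all of them are actually installed).

Some working configurations are listed below (updated for the latest TensorFlow versions):

  • CUDA 11.0 + cuDNN 8.0.4 + TensorFlow 2.4.0

  • CUDA 10.1 + cuDNN 7.6.5 (generally > 7.6) + TensorFlow 2.2.0 / TensorFlow 2.3.0 (TF >= 2.1 requires CUDA >= 10.1)

  • CUDA 10.1 + cuDNN 7.6.5 (generally > 7.6) + TensorFlow 2.1.0 (TF >= 2.1 requires CUDA >= 10.1)

  • CUDA 10.0 + cuDNN 7.6.3 + TensorFlow 1.13 / 1.14 / TensorFlow 2.0

  • CUDA 9.0 + cuDNN 7.0.5 + TensorFlow 1.10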

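  • This error usually appears when you have installed incompatible versions of TensorFlow and cuDNN. In my case, it appeared when I tried to use an older TensorFlow with a newer version of cuDNN.

    If for some reason you get a message like the following (and nothing happens afterwards):

    Relying on driver to perform ptx compilation


    Solution: install the latest NVIDIA driver.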
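
The configurations listed in this answer can be written down as a small lookup table, which makes it easy to compare an installed stack against them. This is only a sketch: the table copies the combinations reported above and is not an official compatibility matrix, and `expected_stack` and `matches` are hypothetical helper names.

```python
# Combinations copied from the list in this answer (not an official matrix).
KNOWN_GOOD = [
    {"cuda": "11.0", "cudnn": "8.0.4", "tensorflow": ("2.4",)},
    {"cuda": "10.1", "cudnn": "7.6.5", "tensorflow": ("2.1", "2.2", "2.3")},
    {"cuda": "10.0", "cudnn": "7.6.3", "tensorflow": ("1.13", "1.14", "2.0")},
    {"cuda": "9.0", "cudnn": "7.0.5", "tensorflow": ("1.10",)},
]


def major_minor(version):
    """Reduce a version such as '2.0.0-rc0' or '7.6.5' to 'major.minor'."""
    parts = version.split(".")
    return parts[0] + "." + parts[1]


def expected_stack(tf_version):
    """Return the listed CUDA/cuDNN row for a TensorFlow version, or None."""
    key = major_minor(tf_version)
    for row in KNOWN_GOOD:
        if key in row["tensorflow"]:
            return row
    return None


def matches(tf_version, cuda_version, cudnn_version):
    """True if the CUDA and cuDNN major.minor versions match the listed row."""
    row = expected_stack(tf_version)
    if row is None:
        return False
    return (
        major_minor(cuda_version) == row["cuda"]
        and major_minor(cudnn_version) == major_minor(row["cudnn"])
    )


if __name__ == "__main__":
    # The environment from the question: TF 2.0.0-rc0, CUDA 10.0, cuDNN 7.6.x
    print(expected_stack("2.0.0-rc0"))
    print(matches("2.0.0-rc0", "10.0", "7.6.3"))
```

The comparison is deliberately coarse (major and minor version only). A match here means the installed stack agrees with the list; it does not prove that the libraries TensorFlow actually loads at runtime are the ones you checked.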

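    Check the TensorFlow GPU instructions for your operating system. That resolved the problem for me on Ubuntu 16.04.6 LTS with TensorFlow 2.0.

    For those facing the error above on Windows: I resolved it simply by installing the cuDNN version that is compatible with the CUDA version already installed on the system.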

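    • The suitable version can be downloaded from NVIDIA's website. You may need an NVIDIA account, which is easy to create by providing an email address and filling in a questionnaire.
    • To check your CUDA version, run
      nvcc --version
    • After downloading the suitable version, extract the folder from the zip file.
    • Go to the bin folder of the extracted folder. Copy
      cudnn64_7.dll
      and paste it into CUDA's bin folder. In my case, CUDA is installed at
      C:\Program Files\NVIDIA GPU Computing Toolkit\Cuda\v10.0\bin
    • This will most likely solve the problem.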
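
The two checks in the steps above (which CUDA release is installed, and whether the cuDNN DLL ended up in CUDA's bin folder) could be scripted roughly as follows. This is a sketch under stated assumptions: the function names are hypothetical, the bin path is the one quoted in this answer, and the parser relies on `nvcc --version` printing a line of the form `release 10.0, V10.0.130`.

```python
import re
import shutil
import subprocess
from pathlib import Path


def parse_nvcc_release(nvcc_output):
    """Extract 'X.Y' from the 'release X.Y' part of `nvcc --version` output."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release():
    """Run `nvcc --version` if nvcc is on PATH; return e.g. '10.0', else None."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run(
        [nvcc, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return parse_nvcc_release(result.stdout)


def cudnn_dll_present(cuda_bin_dir, dll_name="cudnn64_7.dll"):
    """True if the cuDNN DLL exists in the given CUDA bin folder."""
    return (Path(cuda_bin_dir) / dll_name).is_file()


if __name__ == "__main__":
    # Path taken from this answer; adjust it to your own installation.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\Cuda\v10.0\bin"
    print("CUDA release reported by nvcc:", installed_cuda_release())
    print("cudnn64_7.dll present:", cudnn_dll_present(cuda_bin))
```

The script only reports what it finds. Copying the DLL is left as a manual step, since writing into `Program Files` normally needs administrator rights.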
  • My system details:

  • Windows 10
  • CUDA 10.0
  • TensorFlow 2.0
  • GPU: Nvidia GTX 1060

  • I also found this blog very useful.

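    Thanks. To make sure CUDA/cuDNN/TF are the right versions, I pulled an image from Docker Hub, "tensorflow/tensorflow:2.0.0rc0-gpu-py3", and ran my code in the container... but it still doesn't work and gives the same error message.

    Try installing them manually, then check the dependencies installed in the Docker image again. There must be some small difference that you missed.

    Thanks. Has anyone tried any of the versions above? For example CUDA 10.1 + cuDNN 7.6.4.

    I edited the answer to make it clearer. It is CUDA 10.0, not just 10, because there are differences between 10.0 and 10.1.

    I have updated the answer for the latest TensorFlow.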
    The code from the question:

    import numpy as np
    import tensorflow as tf

    # Assumption: the original does not show this constant; 192 is inferred
    # from the model name mobilenetv2_1.00_192 in the log.
    INPUT_SHAPE = 192

    if __name__ == "__main__":

        print("=====>GPU Available: ", tf.test.is_gpu_available())
        # Log the device each op is placed on (source of the "device:GPU:0" lines)
        tf.debugging.set_log_device_placement(True)

        gpus = tf.config.experimental.list_physical_devices('GPU')
        if gpus:
            try:
                # Use only the first GPU and let its memory allocation grow on demand
                tf.config.experimental.set_visible_devices(gpus[0], 'GPU')
                tf.config.experimental.set_memory_growth(gpus[0], True)
                logical_gpus = tf.config.experimental.list_logical_devices('GPU')
                print("=====>", len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
            except RuntimeError as e:
                # Memory growth must be set before GPUs have been initialized
                print(e)

        paras_path = "./paras/{}".format(int(2011))
        model = tf.keras.experimental.load_from_saved_model(paras_path)
        # Warm-up prediction on an all-zero image batch of shape (1, H, W, 3)
        initial_tensor = np.zeros((1, INPUT_SHAPE, INPUT_SHAPE, 3))
        model.predict(initial_tensor)