TensorFlow 2 does not run on Ubuntu 18.04. Probably an incorrect installation, but what is wrong?

I installed TensorFlow 2.1 on Ubuntu 18.04.3 with CUDA 10.1, cuDNN 7.6.5.32, and NVIDIA driver 430.5.

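I could not follow the instructions on the TensorFlow site exactly, because many of the steps did not work, but after many hours I finally got every component installed. When I try to run a 20-line MNIST example, I get: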

2020-02-19 03:02:24.915143: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce GTX 1080 Ti computeCapability: 6.1
coreClock: 1.683GHz coreCount: 28 deviceMemorySize: 10.91GiB deviceMemoryBandwidth: 451.17GiB/s
2020-02-19 03:02:24.915194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-19 03:02:24.915216: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-19 03:02:24.915234: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-19 03:02:24.915253: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-19 03:02:24.915271: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-19 03:02:24.915289: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-19 03:02:24.915308: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-19 03:02:24.917997: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-02-19 03:02:24.918060: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-19 03:02:24.920974: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-19 03:02:24.921000: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-02-19 03:02:24.921013: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-02-19 03:02:24.924091: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10258 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:01:00.0, compute capability: 6.1)
Train on 60000 samples
Epoch 1/5
2020-02-19 03:02:26.155747: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-19 03:02:26.156063: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-02-19 03:02:26.156110: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-02-19 03:02:26.156225: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-02-19 03:02:26.156253: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-02-19 03:02:26.156483: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-02-19 03:02:26.158110: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-02-19 03:02:26.158133: W tensorflow/stream_executor/stream.cc:2041] attempting to perform BLAS operation using StreamExecutor without BLAS support
2020-02-19 03:02:26.158158: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Internal: Blas GEMM launch failed : a.shape=(32, 784), b.shape=(784, 128), m=32, n=128, k=784
     [[{{node sequential/dense/MatMul}}]]
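I know this error probably means something is wrong with the installation, but how do I determine what? Is there a way to find out which version of cuDNN is being used?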


I have searched Google extensively. Many people have the same problem, but I found no solution.
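On the cuDNN question: the log above only shows that `libcudnn.so.7` was opened, which gives the major version and nothing more. One possible way to get the full version is to read the `CUDNN_MAJOR`, `CUDNN_MINOR`, and `CUDNN_PATCHLEVEL` defines from `cudnn.h`. The sketch below assumes the cuDNN 7 development headers are installed. The two header paths are guesses at common locations, not something taken from this setup. It reports the header version, which may differ from the runtime library that TensorFlow actually loads.

```python
import re
from pathlib import Path

# Assumed header locations; adjust to wherever cuDNN was actually installed.
CANDIDATE_HEADERS = [
    "/usr/include/cudnn.h",
    "/usr/local/cuda/include/cudnn.h",
]


def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from the text of cudnn.h, or None."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 7".
        match = re.search(rf"^\s*#define\s+{name}\s+(\d+)", header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


def installed_cudnn_version():
    """Return (path, version) for the first candidate header that parses, else None."""
    for path in CANDIDATE_HEADERS:
        header = Path(path)
        if header.is_file():
            version = parse_cudnn_version(header.read_text(errors="replace"))
            if version is not None:
                return path, version
    return None


if __name__ == "__main__":
    print(installed_cudnn_version())
```

If this prints `None`, either the headers are somewhere else or only the runtime package is installed.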

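I spent two days trying to get this junk to work, and I will never know why it finally started working. I now know that everything I did was correct in principle. Even so, although the installation apparently succeeded, a simple MNIST example failed with CUBLAS_STATUS_NOT_INITIALIZED.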

When I finally got it working, I:

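  • Removed all CUDA-related packages before starting
  • Followed this procedure, which is much clearer than the official documentation
  • At step 4), when installing CUDA 10.1, ran the following: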

    sudo apt-get install cuda-10-1

    instead of:

    sudo apt-get install cuda

This ensures that CUDA is not automatically upgraded to the latest version (10.2 at the time of writing).
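To confirm that only the pinned toolkit ended up on the machine, the installed `cuda-X-Y` meta-packages can be listed. This is a rough sketch, not part of the procedure above. It assumes a Debian/Ubuntu system with `dpkg-query` available, and the helper names are made up for illustration.

```python
import re
import subprocess


def parse_installed(dpkg_output):
    """Map package name -> version for lines whose dpkg status starts with 'ii'."""
    packages = {}
    for line in dpkg_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0].startswith("ii"):
            packages[fields[1]] = fields[2]
    return packages


def toolkit_versions(packages):
    """Return versions such as '10.1' taken from meta-package names like cuda-10-1."""
    versions = []
    for name in packages:
        match = re.fullmatch(r"cuda-(\d+)-(\d+)", name)
        if match:
            versions.append(f"{match.group(1)}.{match.group(2)}")
    return sorted(versions)


def list_cuda_packages():
    """Ask dpkg-query for every package matching 'cuda*' and keep the installed ones."""
    try:
        result = subprocess.run(
            ["dpkg-query", "-W", "-f=${db:Status-Abbrev} ${Package} ${Version}\n", "cuda*"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
        )
    except FileNotFoundError:
        return {}  # dpkg-query is not available on this system
    return parse_installed(result.stdout)


if __name__ == "__main__":
    print(toolkit_versions(list_cuda_packages()))
```

If the list contains anything besides 10.1 (for example 10.2), the unversioned `cuda` package was probably installed or upgraded at some point.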