TensorFlow cannot open libcuda.so.1

I have a laptop with a GeForce 940 MX. I want to get TensorFlow running on the GPU. I installed everything from their tutorial page, and now when I import TensorFlow I get:

>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: 
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH: 
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
>>> 
After that, I think it simply falls back to running on the CPU.

EDIT: After nuking everything and starting over from scratch, this is what I get now:

>>> import tensorflow
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcuda.so.1. LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: workLaptop
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: Not found: was unable to find libcuda.so DSO loaded into this program
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: Permission denied: could not open driver version path for reading: /proc/driver/nvidia/version
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1092] LD_LIBRARY_PATH: :/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1093] failed to find libcuda.so on this system: Failed precondition: could not dlopen DSO: libcuda.so.1; dlerror: libnvidia-fatbinaryloader.so.367.57: cannot open shared object file: No such file or directory
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
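
The dlopen failure in the log can probably be reproduced without TensorFlow, which helps separate a driver problem from a TensorFlow problem. The following is a minimal sketch, assuming a Linux system with Python 3; the helper name `try_load` is hypothetical. It asks the dynamic loader for the library through `ctypes` and reports the loader's own error text, which should resemble the `dlerror` part of the log (for example a missing dependency such as libnvidia-fatbinaryloader).

```python
import ctypes


def try_load(name="libcuda.so.1"):
    """Try to open a shared library by name.

    Returns (True, None) on success, or (False, error_text) if the
    dynamic loader could not open the library or one of its dependencies.
    """
    try:
        ctypes.CDLL(name)
    except OSError as exc:
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    ok, err = try_load()
    if ok:
        print("libcuda.so.1 loaded")
    else:
        print("dlopen failed:", err)
```

If this fails in the same way as the TensorFlow import, the problem most likely lies in the driver installation rather than in TensorFlow.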

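libcuda.so.1 is a symlink to a file that is specific to the version of your NVIDIA driver. It may be pointing to the wrong version, or it may not exist at all.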

# See where the link is pointing.
ls -la /usr/lib/x86_64-linux-gnu/libcuda.so.1
# My result:
# lrwxrwxrwx 1 root root 19 Feb 22 20:40 \
# /usr/lib/x86_64-linux-gnu/libcuda.so.1 -> ./libcuda.so.375.39

# Make sure it is pointing to the right version. 
# Compare it with the installed NVIDIA driver.
nvidia-smi

# Replace libcuda.so.1 with a link to the correct version
cd /usr/lib/x86_64-linux-gnu
sudo ln -f -s libcuda.so.<yournvidia.version> libcuda.so.1
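
The two failure cases named above (link missing, or link pointing at a file that is not there) can also be checked from a script. This is a minimal sketch, assuming Python 3 on Linux; the function name `describe_link` and the returned dictionary layout are hypothetical, and the default path is the one used in the commands above.

```python
import os


def describe_link(path):
    """Describe a symlink: is it a link, where does it point, does the target exist."""
    info = {"path": path, "is_link": os.path.islink(path),
            "target": None, "target_exists": False}
    if info["is_link"]:
        info["target"] = os.readlink(path)
        # os.path.exists follows the link, so False means a dangling link.
        info["target_exists"] = os.path.exists(path)
    return info


if __name__ == "__main__":
    print(describe_link("/usr/lib/x86_64-linux-gnu/libcuda.so.1"))
```

A result with `is_link` true but `target_exists` false would indicate a dangling link left over from a previous driver version.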
Now, in the same way, create another symlink from libcuda.so.1 to a link of the same name in your LD_LIBRARY_PATH directory.


You may also find that you need to create a link named libcuda.so in /usr/lib/x86_64-linux-gnu that points to libcuda.so.1.
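
LD_LIBRARY_PATH is a colon-separated list of directories rather than a single directory, so it can help to see which of its entries actually contain the library. The sketch below assumes Python 3; the function name `find_in_ld_library_path` is hypothetical. Empty entries (such as the one produced by the leading ':' in the log output of the question) are skipped here.

```python
import os


def find_in_ld_library_path(name="libcuda.so.1", env=None):
    """Return the paths under the LD_LIBRARY_PATH entries where `name` exists."""
    env = os.environ if env is None else env
    hits = []
    for directory in env.get("LD_LIBRARY_PATH", "").split(":"):
        if not directory:
            continue  # skip empty entries in this sketch
        candidate = os.path.join(directory, name)
        # lexists also reports dangling symlinks, which are worth seeing.
        if os.path.lexists(candidate):
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    print(find_in_ld_library_path() or "libcuda.so.1 not found via LD_LIBRARY_PATH")
```

An empty result is not necessarily a problem: the dynamic loader also searches the system library directories known to ldconfig, so a correct link in /usr/lib/x86_64-linux-gnu is normally found without any LD_LIBRARY_PATH entry.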

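In my case, what solved it was updating the GPU driver to the latest version and installing the CUDA toolkit. First, I added the PPA and installed the GPU driver: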

sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt update
sudo apt install nvidia-390
After adding the PPA, it showed the available driver versions, and 390 was the latest one listed as "stable".

Then install the CUDA toolkit:

sudo apt install nvidia-cuda-toolkit
Then reboot:

sudo reboot

This updated the driver to a newer version than the 390 originally installed in the first step (it was 410; this was a p2.xlarge instance on AWS).
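
Since the driver version can change during such an update, it may be worth confirming afterwards that the libcuda.so.1 link still matches the running driver. The following is a minimal sketch under stated assumptions: it assumes the first line of /proc/driver/nvidia/version contains the driver version as the first dotted number (for example "... Kernel Module  375.39  ..."), and the function names are hypothetical. Note that in the question's log this file could not be read (Permission denied), so the sketch treats an unreadable file as "unknown".

```python
import os
import re


def parse_driver_version(proc_text):
    """Extract the first dotted version number, e.g. '375.39', or None."""
    match = re.search(r"\b(\d+\.\d+(?:\.\d+)?)\b", proc_text)
    return match.group(1) if match else None


def link_matches_driver(link_target, driver_version):
    """True if a target like './libcuda.so.375.39' ends in the driver version."""
    return os.path.basename(link_target) == "libcuda.so." + driver_version


if __name__ == "__main__":
    try:
        with open("/proc/driver/nvidia/version") as handle:
            version = parse_driver_version(handle.read())
    except OSError:
        version = None  # driver not loaded, or file not readable
    print("kernel driver version:", version)
```

With a known version, `link_matches_driver(os.readlink(path), version)` would tell whether the link needs to be recreated.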

In case anyone is still running into this problem: first make sure you add the --runtime=nvidia argument when running the container:

docker run --runtime=nvidia -t tensorflow/serving:latest-gpu

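tensorflow/serving:latest-gpu is the name of the Docker image.

Have you installed the NVIDIA driver? libcuda is part of the driver, not the CUDA toolkit. Use
find /usr/ -name 'libcuda.so.1'
to check whether the file is in a standard CUDA library directory, which I assume you added to LD_LIBRARY_PATH? If not, just create a symlink to it in the CUDA lib directory.

/usr/lib/x86_64-linux-gnu/libcuda.so.1 and /usr/lib/i386-linux-gnu/libcuda.so.1. Where exactly is the CUDA lib directory?

I'll repeat: "could not open driver version path for reading: /proc/driver/nvidia/version" means that you do not have a functioning CUDA driver when running TensorFlow.

This doesn't look like a TensorFlow problem; rather, it looks like the NVIDIA driver is not properly installed and running. One test: try running "nvidia-smi". If the driver is installed correctly, it should print a list of the available GPUs.

"Now, in the same way, create another symlink from libcuda.so.1 to a link of the same name in your LD_LIBRARY_PATH directory." How exactly is that done? What is my "LD_LIBRARY_PATH directory"? Thanks!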
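
The nvidia-smi test suggested in the comments can be scripted as well. This is a minimal sketch, assuming Python 3.7 or later; the function name `driver_version_from_nvidia_smi` is hypothetical. It uses nvidia-smi's `--query-gpu=driver_version --format=csv,noheader` options and returns None when the tool is missing or fails, which is what one would expect if the driver is not installed or not running.

```python
import shutil
import subprocess


def driver_version_from_nvidia_smi(executable="nvidia-smi"):
    """Return the driver version reported by nvidia-smi, or None if unavailable."""
    path = shutil.which(executable)
    if path is None:
        return None  # tool not installed or not on PATH
    try:
        result = subprocess.run(
            [path, "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # e.g. nvidia-smi cannot talk to the driver
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


if __name__ == "__main__":
    print("driver version:", driver_version_from_nvidia_smi())
```

A None result here points at the driver installation, independent of TensorFlow or the CUDA toolkit.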