TensorFlow won't detect GPU with custom Docker image and Python 3.6
(Posting here, as the issue template advises, before submitting an issue to TensorFlow.) I am trying to build a TensorFlow Docker image with Python 3.6, and I have the following Dockerfile:
FROM nvidia/cuda:8.0-cudnn5-devel-ubuntu16.04
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        curl \
        libfreetype6-dev \
        libpng12-dev \
        libzmq3-dev \
        pkg-config \
        rsync \
        software-properties-common \
        unzip \
        libcupti-dev \
    && add-apt-repository -y ppa:jonathonf/python-3.6 \
    && apt-get update \
    && apt-get install -y python3.6 python3.6-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*
RUN curl -O https://bootstrap.pypa.io/get-pip.py \
    && python3.6 get-pip.py \
    && rm get-pip.py
RUN python3.6 -m pip install --no-cache-dir -U ipython pip setuptools
RUN python3.6 -m pip install --no-cache-dir tensorflow
RUN ln -s /usr/bin/python3.6 /usr/bin/python
ENV LD_LIBRARY_PATH /usr/local/cuda-8.0/lib64:/usr/local/cuda/extras/CUPTI/lib64:$LD_LIBRARY_PATH
ENV CUDA_HOME /usr/local/cuda-8.0
CMD ["ipython"]
I build the image and run a script that forces execution on gpu:0:
nvidia-docker build -t tensorflow .
... (builds successfully)
nvidia-docker run --rm -v $PWD/test.py:/test.py tensorflow python /test.py
...
InvalidArgumentError (see above for traceback): Cannot assign a device for operation 'b': Operation was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/cpu:0 ]. Make sure the device specification refers to a valid device.
[[Node: b = Const[dtype=DT_FLOAT, value=Tensor<type: float shape: [3,2] values: [1 2][3]...>, _device="/device:GPU:0"]()]]
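What am I doing wrong?

(test.py is just the short script listed further down, after the diagnostic output from inside the container.)

(I also tried the base image nvidia/cuda:8.0-cudnn6-devel-ubuntu16.04, which is the one used by tensorflow/tensorflow:latest-gpu, but to no avail.)

It turned out to be pretty simple: install tensorflow-gpu instead of tensorflow. The Dockerfile above installs the plain tensorflow package, which is why only cpu:0 is available. Strangely, the TensorFlow docs do not explain this, but basically it was my own oversight.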
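One way to confirm which build is present inside the image is to ask the packaging metadata which TensorFlow distributions are installed. The sketch below is an illustration and not part of the original post: the helper names `installed_tf_packages` and `diagnose` are hypothetical, and it assumes the TensorFlow 1.x packaging of that period, in which the `tensorflow` wheel was CPU-only and GPU support shipped in the separate `tensorflow-gpu` wheel. `importlib.metadata` needs Python 3.8+; on Python 3.6 the sketch assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib import metadata  # standard library on Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # assumption: backport installed on 3.6

# Distribution names as published on PyPI for TensorFlow 1.x.
CANDIDATES = ("tensorflow", "tensorflow-gpu")


def installed_tf_packages(candidates=CANDIDATES):
    """Return {distribution name: version} for the candidates that are installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            pass  # not installed; skip it
    return found


def diagnose(found):
    """Turn the mapping from installed_tf_packages() into a one-line hint."""
    if "tensorflow-gpu" in found:
        return "tensorflow-gpu {} is installed".format(found["tensorflow-gpu"])
    if "tensorflow" in found:
        return "only the CPU build (tensorflow {}) is installed".format(found["tensorflow"])
    return "no TensorFlow distribution found"


if __name__ == "__main__":
    print(diagnose(installed_tf_packages()))
```

Run with `python3.6` inside the container, this would be expected to report only the CPU build for the Dockerfile as posted, and the GPU build once the install line names `tensorflow-gpu`.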
nvidia-docker run --rm tensorflow bash -c "nvidia-smi; nvcc --version; cat /usr/include/cudnn.h | grep CUDNN_MAJOR -A 2"
Sun Jul 23 22:50:11 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66 Driver Version: 375.66 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 750 Off | 0000:01:00.0 On | N/A |
| 21% 35C P8 1W / 38W | 795MiB / 976MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
#define CUDNN_MAJOR 5
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 10
--
#define CUDNN_VERSION (CUDNN_MAJOR * 1000 + CUDNN_MINOR * 100 + CUDNN_PATCHLEVEL)
#include "driver_types.h"
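The `CUDNN_VERSION` macro in the header output above combines the three components into a single integer. As a small worked example (the function name `cudnn_version` is hypothetical and not from the post), the same arithmetic can be reproduced from the grep output:

```python
import re


def cudnn_version(header_text):
    """Parse CUDNN_MAJOR/MINOR/PATCHLEVEL and apply the CUDNN_VERSION formula."""
    parts = {}
    for key in ("MAJOR", "MINOR", "PATCHLEVEL"):
        match = re.search(r"#define\s+CUDNN_{}\s+(\d+)".format(key), header_text)
        if match is None:
            raise ValueError("CUDNN_{} not found".format(key))
        parts[key] = int(match.group(1))
    # Same formula as the macro in cudnn.h: MAJOR * 1000 + MINOR * 100 + PATCHLEVEL
    return parts["MAJOR"] * 1000 + parts["MINOR"] * 100 + parts["PATCHLEVEL"]


# The three defines exactly as they appear in the output above.
sample = """
#define CUDNN_MAJOR 5
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 10
"""
print(cudnn_version(sample))  # 5 * 1000 + 1 * 100 + 10 = 5110
```

For the header in this image that gives 5110, i.e. cuDNN 5.1.10, which matches the `cudnn5` tag of the base image.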
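test.py:
import tensorflow as tf

# Pin the constants and the matmul to the first GPU; this fails if no GPU is registered.
with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# log_device_placement=True prints the device each op is assigned to.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
print(sess.run(c))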
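A quicker check than running the full script is to list the devices TensorFlow itself registers. The sketch below is an addition for illustration: `gpu_names` and `local_devices` are hypothetical helper names, while `device_lib.list_local_devices()` is the TensorFlow call they wrap. The TensorFlow import is kept inside `local_devices` so the filtering logic can be tried without TensorFlow installed.

```python
from types import SimpleNamespace


def gpu_names(devices):
    """Return the names of all devices whose device_type is 'GPU'."""
    return [d.name for d in devices if d.device_type == "GPU"]


def local_devices():
    """Ask TensorFlow which devices it registered (requires TensorFlow)."""
    from tensorflow.python.client import device_lib
    return device_lib.list_local_devices()


# Stand-in objects with the same two attributes, mimicking a CPU-only install.
cpu_only = [SimpleNamespace(name="/cpu:0", device_type="CPU")]
print(gpu_names(cpu_only))  # [] because no GPU device is registered
```

Inside the container, `gpu_names(local_devices())` returning an empty list would point to the same condition as the error message: TensorFlow sees only the CPU.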