Python deep learning script takes a long time to detect the GPU


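The script runs correctly and uses the GPU, as I can see from the CUDA GPU performance once the script finally runs.

However, it takes 166 seconds before the model actually starts running, while running the model itself takes only 3 seconds.

My setup is as follows: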

GPU                        NVIDIA RTX3060
CUDA                       10.1
CUDNN                      7.6.5
Python                     3.8.6
tensorboard                2.2.2
tensorboard-plugin-wit     1.8.0
tensorflow-addons          0.10.0
tensorflow-estimator       2.4.0
tensorflow-gpu             2.2.0
tensorflow-gpu-estimator   2.2.0
The output pauses for 55 seconds at:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0

and pauses again for 111 seconds at:

2021-04-19 15:10:22.239882: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll

The full output is as follows:

2021-04-19 15:01:52.804967: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-04-19 15:01:55.462194: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2021-04-19 15:01:55.487475: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2021-04-19 15:01:55.499557: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-04-19 15:01:55.508753: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-04-19 15:01:55.516485: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-04-19 15:01:55.521269: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-04-19 15:01:55.533892: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-04-19 15:01:55.540034: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-04-19 15:01:55.553184: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-04-19 15:01:55.563273: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-04-19 15:01:55.571440: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2021-04-19 15:01:55.590248: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23109067890 initialized for platform Host (this does not guarantee that XLA will be 
used). Devices:
2021-04-19 15:01:55.604730: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-04-19 15:01:55.613130: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: NVIDIA GeForce RTX 3060 computeCapability: 8.6
coreClock: 1.837GHz coreCount: 28 deviceMemorySize: 12.00GiB deviceMemoryBandwidth: 335.32GiB/s
2021-04-19 15:01:55.634118: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2021-04-19 15:01:55.644562: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2021-04-19 15:01:55.654208: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2021-04-19 15:01:55.663859: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2021-04-19 15:01:55.674225: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2021-04-19 15:01:55.683850: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2021-04-19 15:01:55.694031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2021-04-19 15:01:55.703614: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-04-19 15:02:52.267051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-04-19 15:02:52.272082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0
2021-04-19 15:02:52.275738: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0:   N 
2021-04-19 15:02:52.278956: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9510 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 3060, pci bus id: 0000:01:00.0, compute capability: 8.6)
2021-04-19 15:02:52.291225: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x23128d9a5e0 initialized for platform CUDA (this does not guarantee that XLA will be 
used). Devices:
2021-04-19 15:02:52.298843: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 3060, Compute Capability 8.6
2021-04-19 15:02:53.490971: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
WARNING:tensorflow:From c:\Users\trinamiX\Desktop\test.py:21: Sequential.predict_classes (from tensorflow.python.keras.engine.sequential) is deprecated and will be removed after 2021-01-01.
Instructions for updating:
Please use instead:* `np.argmax(model.predict(x), axis=-1)`,   if your model does multi-class classification   (e.g. if it uses a `softmax` last-layer activation).* `(model.predict(x) > 0.5).astype("int32")`,   if your model does binary classification   (e.g. if it uses a `sigmoid` last-layer activation).
[6.0, 148.0, 72.0, 35.0, 0.0, 33.6, 0.627, 50.0] => 0 (expected 1)
[1.0, 85.0, 66.0, 29.0, 0.0, 26.6, 0.351, 31.0] => 0 (expected 0)
[8.0, 183.0, 64.0, 0.0, 0.0, 23.3, 0.672, 32.0] => 1 (expected 1)
[1.0, 89.0, 66.0, 23.0, 94.0, 28.1, 0.167, 21.0] => 0 (expected 0)
[0.0, 137.0, 40.0, 35.0, 168.0, 43.1, 2.288, 33.0] => 1 (expected 1)
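
As a side note, pauses like these can be located directly from the timestamps in the log rather than with a stopwatch. The following is a minimal sketch, assuming each relevant line starts with a timestamp in the `YYYY-MM-DD HH:MM:SS.ffffff` form shown above; the helper names `parse_timestamp` and `largest_gaps` are made up for illustration, and the sample lines are copied from the output above.

```python
from datetime import datetime

# Four consecutive lines copied from the log above.
LOG_LINES = [
    "2021-04-19 15:01:55.694031: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll",
    "2021-04-19 15:01:55.703614: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0",
    "2021-04-19 15:02:52.267051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:",
    "2021-04-19 15:02:52.272082: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108]      0",
]

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"
TIMESTAMP_LENGTH = 26  # length of "2021-04-19 15:01:55.694031"


def parse_timestamp(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:TIMESTAMP_LENGTH], TIMESTAMP_FORMAT)
    except ValueError:
        return None  # continuation lines such as "used). Devices:"


def largest_gaps(lines, top=3):
    """Return (seconds, line_before_gap) pairs, longest pause first."""
    stamped = [(ts, line) for line in lines
               if (ts := parse_timestamp(line)) is not None]
    gaps = [((later_ts - earlier_ts).total_seconds(), earlier_line)
            for (earlier_ts, earlier_line), (later_ts, _)
            in zip(stamped, stamped[1:])]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]


for seconds, line in largest_gaps(LOG_LINES):
    # Skip the timestamp and ": " prefix when printing the message.
    print(f"{seconds:8.3f} s after: {line[TIMESTAMP_LENGTH + 2:]}")
```

On the lines above, the longest pause by far is the one that follows `Adding visible gpu devices: 0`, which matches the first pause described in the question.
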
The code I am running is from:

You can find it below:

# first neural network with keras make predictions
from numpy import loadtxt
from tensorflow import keras
from keras.models import Sequential
from keras.layers import Dense
# load the dataset
dataset = loadtxt(r"C:\Users\trinamiX\Desktop\pima-indians-diabetes.data.csv", delimiter=',')
# split into input (X) and output (y) variables
X = dataset[:,0:8]
y = dataset[:,8]
# define the keras model
model = Sequential()
model.add(Dense(12, input_dim=8, activation='relu'))
model.add(Dense(8, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# compile the keras model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
# fit the keras model on the dataset
model.fit(X, y, epochs=150, batch_size=10, verbose=0)
# make class predictions with the model
predictions = model.predict_classes(X)
# summarize the first 5 cases
for i in range(5):
    print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))

Any help would be much appreciated.

The RTX 3060 card is based on the Ampere architecture, for which compatible CUDA versions start at 11.x.

Once you upgrade TensorFlow to 2.4.0, CUDA to 11.0, and cuDNN to 8.0, your problem should be resolved.
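
A likely explanation for the delay, which is not spelled out above, is that binaries built against CUDA 10.1 contain no precompiled kernels for compute capability 8.6, so the driver has to JIT-compile them at startup. The version rule itself can be sketched as a small check. This is only an illustration: `MIN_CUDA_MAJOR` and `cuda_is_new_enough` are hypothetical names, and the table holds just the single Ampere rule stated above (compute capability 8.x needs CUDA 11.x or later).

```python
def parse_version(text):
    """Turn a dotted version string such as '10.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Assumption for illustration: minimum CUDA major version keyed by the
# major compute capability. Only the Ampere rule (8 -> 11) is listed.
MIN_CUDA_MAJOR = {8: 11}


def cuda_is_new_enough(compute_capability, cuda_version):
    """Return True/False for known architectures, None when no rule is listed."""
    cc_major = parse_version(compute_capability)[0]
    required_major = MIN_CUDA_MAJOR.get(cc_major)
    if required_major is None:
        return None  # this sketch has no rule for that architecture
    return parse_version(cuda_version)[0] >= required_major


# Values taken from the question: RTX 3060 (8.6) with CUDA 10.1.
print(cuda_is_new_enough("8.6", "10.1"))  # setup from the question
print(cuda_is_new_enough("8.6", "11.0"))  # setup after the upgrade
```

The first call reports the original setup as too old and the second reports the upgraded one as acceptable. After upgrading, newer TensorFlow releases also expose `tf.sysconfig.get_build_info()`, whose `cuda_version` and `cudnn_version` entries show what the installed binary was built against.
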


For more details, please refer to.

I figured this out about a week later, but I forgot to post my answer. Thanks for reminding me!