TensorFlow: Blas GEMM launch failed


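I am trying to run a simple CNN and I get the error message "Blas GEMM launch failed". TensorFlow 2.1.0 is set up correctly on my machine, and I can run the TensorFlow examples successfully. However, TensorRT is not installed, which produces some warnings: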

python -c 'import tensorflow as tf; print(tf.__version__)'
2020-01-21 20:26:39.850967: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory
2020-01-21 20:26:39.851030: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory
2020-01-21 20:26:39.851040: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2.1.0
Here is the error I get:

2020-01-21 20:21:11.549012: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-01-21 20:21:11.549233: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:11.549266: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:11.549347: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:11.549370: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:11.549452: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:11.549467: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:11.552664: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-01-21 20:21:12.266456: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:12.319531: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:12.350929: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:12.351077: E tensorflow/stream_executor/cuda/cuda_blas.cc:238] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2020-01-21 20:21:12.351089: W tensorflow/stream_executor/stream.cc:2041] attempting to perform BLAS operation using StreamExecutor without BLAS support
2020-01-21 20:21:12.351114: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Internal: Blas GEMM launch failed : a.shape=(32, 50176), b.shape=(50176, 32), m=32, n=32, k=50176
     [[{{node sequential/dense/MatMul}}]]
32/32 [==============================] - 1s 33ms/sample
Traceback (most recent call last):
  File "xcnn.py", line 27, in <module>
    history = model.fit(images, labels, epochs=1)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 342, in fit
    total_epochs=epochs)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/def_function.py", line 632, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2363, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
    self.captured_inputs)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/home/marc/tf_2/lib/python3.6/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError:  Blas GEMM launch failed : a.shape=(32, 50176), b.shape=(50176, 32), m=32, n=32, k=50176
     [[node sequential/dense/MatMul (defined at xcnn.py:27) ]] [Op:__inference_distributed_function_932]

Function call stack:
distributed_function

I don't think the TensorRT warnings are related; they are probably just telling you that you cannot use tensorflow.python.compiler.tensorrt.* without TensorRT installed.

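As for the CUBLAS error, it seems the fix could be one of several solutions from this thread:

  • OOM error: limit GPU memory growth
  • Delete the cache folder (~/.nv)
  • A configuration mismatch with the CUDA/cuDNN versions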
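
For the first item, here is a minimal sketch of enabling GPU memory growth in TensorFlow 2.x, using tf.config.experimental.list_physical_devices and tf.config.experimental.set_memory_growth. The helper name enable_memory_growth and the guarded import are illustrative additions (the guard only lets the snippet run on a machine without TensorFlow). It has to run before any operation touches the GPU, so it belongs at the very top of the script.

```python
try:
    import tensorflow as tf
except ImportError:  # lets the sketch run where TensorFlow is absent
    tf = None


def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand instead of all at once.

    Returns the number of GPUs that were configured.
    """
    if tf is None:
        return 0
    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before the GPU has been initialized,
        # otherwise TensorFlow raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


configured = enable_memory_growth()
print("GPUs configured for memory growth:", configured)
```

If this does not help, as one commenter below reports, the other two items are worth checking next.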

    • Limiting GPU memory growth did not work for me. Instead, deleting the contents of ~/.nv solved it in my case. I don't know why.
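
Clearing the cache folder can also be scripted. The sketch below is one possible way to do it in Python; the helper name clear_dir_contents is hypothetical, and the demo runs against a temporary directory so that nothing real is deleted when trying it out.

```python
import shutil
import tempfile
from pathlib import Path


def clear_dir_contents(cache_dir):
    """Delete everything inside cache_dir but keep the directory itself.

    Returns the sorted names of the removed entries
    (an empty list if the directory does not exist).
    """
    cache_dir = Path(cache_dir).expanduser()
    removed = []
    if not cache_dir.is_dir():
        return removed
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # real subdirectory: remove recursively
        else:
            entry.unlink()         # regular file or symlink
        removed.append(entry.name)
    return sorted(removed)


# Demo on a throwaway directory standing in for the cache folder.
demo = Path(tempfile.mkdtemp())
(demo / "cached.bin").write_bytes(b"\x00")
(demo / "subdir").mkdir()
print(clear_dir_contents(demo))
```

To apply it to the folder from this thread, call clear_dir_contents("~/.nv") while no GPU program is running.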

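      I ran into the same problem; in my case, restarting the program fixed it :)

Code from the question (xcnn.py; its line 27 is the model.fit call shown in the traceback):
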
      import numpy as np
      
      from tensorflow.keras import layers, models
      
      IMAGE_WIDTH = 128
      IMAGE_HEIGHT = 128
      
      model = models.Sequential()
      model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(IMAGE_WIDTH,IMAGE_HEIGHT,3)))
      model.add(layers.MaxPooling2D((2, 2)))
      model.add(layers.Conv2D(64, (3, 3), activation='relu'))
      model.add(layers.MaxPooling2D((2, 2)))
      model.add(layers.Conv2D(64, (3, 3), activation='relu'))
      model.add(layers.Flatten())
      model.add(layers.Dense(32, activation='relu'))
      model.add(layers.Dense(4, activation='softmax'))
      
      model.compile(optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy'])
      
      BATCH_SIZE = 32
      
      images = np.zeros((BATCH_SIZE, IMAGE_WIDTH, IMAGE_HEIGHT, 3))
      labels = np.zeros((BATCH_SIZE, 4))
      
      history = model.fit(images, labels, epochs=1)
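
A side note on reading the error message: the shapes in "a.shape=(32, 50176), b.shape=(50176, 32)" match the first Dense layer of this model, so the failing MatMul is simply the first matrix multiplication that needs cuBLAS. It does not point to a bug in the model. The sketch below reproduces the 50176 figure by hand, assuming the Keras defaults for Conv2D (padding "valid", stride 1) and MaxPooling2D (stride equal to the pool size); the helper names are illustrative.

```python
def conv_valid(size, kernel=3):
    """Output side length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def max_pool(size, window=2):
    """Output side length of non-overlapping max pooling (floor division)."""
    return size // window


side = 128                 # IMAGE_WIDTH == IMAGE_HEIGHT == 128
side = conv_valid(side)    # Conv2D(32, (3, 3))
side = max_pool(side)      # MaxPooling2D((2, 2))
side = conv_valid(side)    # Conv2D(64, (3, 3))
side = max_pool(side)      # MaxPooling2D((2, 2))
side = conv_valid(side)    # Conv2D(64, (3, 3))

flattened = side * side * 64   # Flatten() over 64 channels
print(side, flattened)
```

With a batch of 32 samples and Dense(32), this gives m=32, n=32 and k equal to the flattened size, as in the log.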