Python Keras model prediction fails


I am using TensorFlow 2.0.0 with its bundled Keras. I successfully trained my large model with a batch size of 16, but when I call `predict`, I get this error in the terminal:

2020-01-27 16:54:01.298606: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
Using TensorFlow backend.
2020-01-27 16:54:05.417321: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-01-27 16:54:05.438414: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.665
pciBusID: 0000:01:00.0
2020-01-27 16:54:05.443695: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-01-27 16:54:05.448672: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-01-27 16:54:05.451627: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-01-27 16:54:05.459043: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2080 Ti major: 7 minor: 5 memoryClockRate(GHz): 1.665
pciBusID: 0000:01:00.0
2020-01-27 16:54:05.464931: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2020-01-27 16:54:05.468931: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2020-01-27 16:54:05.953379: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-27 16:54:05.957463: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2020-01-27 16:54:05.961678: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2020-01-27 16:54:05.965985: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 8685 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-01-27 17:03:36.790437: W tensorflow/core/grappler/optimizers/implementation_selector.cc:310] Skipping optimization due to error while loading function libraries: Invalid argument: Functions '__inference_cudnn_lstm_with_fallback_10134_specialized_for_model_sequential_bidirectional_backward_lstm_StatefulPartitionedCall_at___inference_distributed_function_15069' and '__inference_cudnn_lstm_with_fallback_10134' both implement 'lstm_157a770f-be75-4a59-a000-e5bd1949e961' but their signatures do not match.
2020-01-27 17:03:37.145999: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-01-27 17:03:37.404156: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-01-27 17:03:39.152137: E tensorflow/stream_executor/dnn.cc:588] CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(1796): 'cudnnRNNForwardTraining( cudnn.handle(), rnn_desc.handle(), model_dims.max_seq_length, input_desc.handles(), input_data.opaque(), input_h_desc.handle(), input_h_data.opaque(), input_c_desc.handle(), input_c_data.opaque(), rnn_desc.params_handle(), params.opaque(), output_desc.handles(), output_data->opaque(), output_h_desc.handle(), output_h_data->opaque(), output_c_desc.handle(), output_c_data->opaque(), workspace.opaque(), workspace.size(), reserve_space.opaque(), reserve_space.size())'
2020-01-27 17:03:39.152182: F tensorflow/stream_executor/cuda/cuda_dnn.cc:189] Check failed: status == CUDNN_STATUS_SUCCESS (7 vs. 0)Failed to set cuDNN stream.
When I run this in a Jupyter notebook, the kernel simply crashes without any error message.

For `predict`, I used the same batch size as in training (I also tried smaller batch sizes, and predicting on a smaller dataset).


Any idea what might be going on?

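Did you have no problems before when training on the GPU? Did you change the TensorFlow version?

Well, I did run into this problem before, which forced me to use `load_weights` instead of `load_model`, but after that training went smoothly. The version is the same for training and prediction; I actually run prediction right after training to compute the confusion matrix.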
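For readers unfamiliar with the `load_weights` workaround mentioned above, here is a minimal sketch of how it might look, followed by a batched `predict` and a confusion matrix. The question does not show the real model, so `build_model`, the layer sizes, the file name `model.weights.h5`, and the random data are all hypothetical placeholders; only the bidirectional LSTM is suggested by the log. This illustrates the workflow and is not a confirmed fix for the cuDNN failure.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf


def build_model(timesteps=5, features=3, num_classes=2):
    """Hypothetical stand-in architecture; replace with the real model definition."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, features)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(8)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])


# Stand-in for the trained model: only its weights are written to disk.
trained = build_model()
weights_path = os.path.join(tempfile.mkdtemp(), "model.weights.h5")
trained.save_weights(weights_path)

# Instead of load_model, rebuild the architecture in code and restore the weights.
restored = build_model()
restored.load_weights(weights_path)

# Random placeholder data standing in for the real evaluation set.
x_test = np.random.rand(32, 5, 3).astype("float32")
y_test = np.random.randint(0, 2, size=32)

# Predict with the same batch size used for training in the question (16).
probs = restored.predict(x_test, batch_size=16, verbose=0)
y_pred = np.argmax(probs, axis=1)

# Rows are true labels, columns are predicted labels.
cm = tf.math.confusion_matrix(y_test, y_pred, num_classes=2).numpy()
print(cm)
```

Because the architecture is rebuilt in code, this path does not deserialize the model configuration the way `load_model` does, which may be why it sidestepped the earlier loading problem.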