Cannot install Keras MXNet without CUDA support on a machine with a GPU
I am explicitly trying to install the version of mxnet without CUDA support. When I install it with CUDA support, I am able to run the code shown below.
Steps to reproduce a successful CUDA-enabled keras-mxnet run:
Here is my GPU configuration, from nvcc --version:
~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
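For reference, the CUDA release reported by nvcc is what determines the mxnet-cu80 package used in the steps that follow. The sketch below shows that mapping under an assumed naming convention (mxnet-cu plus the release digits); the helper name mxnet_package_for is hypothetical, and not every CUDA release necessarily has a published wheel.

```python
import re


def mxnet_package_for(nvcc_output: str) -> str:
    """Guess the pip package name from `nvcc --version` text.

    Assumption: CUDA builds are named mxnet-cu<major><minor>
    (for example mxnet-cu80 for release 8.0). Falls back to the
    CPU-only package name when no release string is found.
    """
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return "mxnet"
    major, minor = match.groups()
    return f"mxnet-cu{major}{minor}"


# Sample line copied from the nvcc output above.
sample = "Cuda compilation tools, release 8.0, V8.0.61"
print(mxnet_package_for(sample))
```

With the nvcc output shown above, this yields the same package name that is installed in the next step.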
Make sure mxnet is not installed
pip install mxnet-cu80
pip install keras-mxnet
Running the code in Jupyter gives me:
60000 train samples
10000 test samples
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 512) 401920
_________________________________________________________________
activation_1 (Activation) (None, 512) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 512) 0
_________________________________________________________________
dense_2 (Dense) (None, 512) 262656
_________________________________________________________________
activation_2 (Activation) (None, 512) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 512) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 5130
_________________________________________________________________
activation_3 (Activation) (None, 10) 0
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
6400/60000 [==>...........................] - ETA: 39s - loss: 2.1718 - acc: 0.2587
/usr/local/lib/python3.6/dist-packages/mxnet/module/bucketing_module.py:408: UserWarning: Optimizer created manually outside Module but rescale_grad is not normalized to 1.0/batch_size/num_workers (1.0 vs. 0.0078125). Is this intended?
force_init=force_init)
60000/60000 [==============================] - 6s 103us/step - loss: 1.2105 - acc: 0.6957 - val_loss: 0.5334 - val_acc: 0.8728
Epoch 2/20
60000/60000 [==============================] - 2s 27us/step - loss: 0.5280 - acc: 0.8515 - val_loss: 0.3749 - val_acc: 0.8996
Epoch 3/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.4239 - acc: 0.8786 - val_loss: 0.3213 - val_acc: 0.9098
Epoch 4/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.3740 - acc: 0.8911 - val_loss: 0.2923 - val_acc: 0.9162
Epoch 5/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.3437 - acc: 0.9008 - val_loss: 0.2704 - val_acc: 0.9218
Epoch 6/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.3195 - acc: 0.9079 - val_loss: 0.2539 - val_acc: 0.9263
Epoch 7/20
60000/60000 [==============================] - 2s 29us/step - loss: 0.2965 - acc: 0.9151 - val_loss: 0.2393 - val_acc: 0.9312
Epoch 8/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.2792 - acc: 0.9190 - val_loss: 0.2264 - val_acc: 0.9342
Epoch 9/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.2641 - acc: 0.9239 - val_loss: 0.2173 - val_acc: 0.9363
Epoch 10/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.2520 - acc: 0.9277 - val_loss: 0.2064 - val_acc: 0.9413
Epoch 11/20
60000/60000 [==============================] - 2s 29us/step - loss: 0.2409 - acc: 0.9306 - val_loss: 0.1983 - val_acc: 0.9425
Epoch 12/20
60000/60000 [==============================] - 2s 30us/step - loss: 0.2307 - acc: 0.9331 - val_loss: 0.1894 - val_acc: 0.9447
Epoch 13/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.2209 - acc: 0.9362 - val_loss: 0.1813 - val_acc: 0.9463
Epoch 14/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.2106 - acc: 0.9396 - val_loss: 0.1756 - val_acc: 0.9478
Epoch 15/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.2044 - acc: 0.9410 - val_loss: 0.1687 - val_acc: 0.9501
Epoch 16/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.1963 - acc: 0.9424 - val_loss: 0.1625 - val_acc: 0.9528
Epoch 17/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.1912 - acc: 0.9436 - val_loss: 0.1576 - val_acc: 0.9542
Epoch 18/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.1842 - acc: 0.9472 - val_loss: 0.1544 - val_acc: 0.9541
Epoch 19/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.1782 - acc: 0.9482 - val_loss: 0.1490 - val_acc: 0.9553
Epoch 20/20
60000/60000 [==============================] - 2s 28us/step - loss: 0.1729 - acc: 0.9494 - val_loss: 0.1447 - val_acc: 0.9570
Test score: 0.144698123593
Test accuracy: 0.957
Steps to reproduce the failing CPU-only keras-mxnet run:
Do the same as before, but instead of installing mxnet-cu80, install mxnet:
pip uninstall mxnet-cu80
pip install mxnet
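Because the CUDA and CPU-only builds are swapped here, it may be worth confirming which mxnet distributions are actually present before re-running the notebook. This is a minimal sketch, assuming Python 3.8+ for importlib.metadata (on the Python 3.6 setup shown in the logs, pip list gives the same information); the helper name installed_mxnet_variants is hypothetical.

```python
from importlib import metadata


def installed_mxnet_variants(names=None):
    """Return the sorted, lower-cased distribution names starting with 'mxnet'.

    `names` is only there so the filter can be tried on a plain list;
    by default the names come from the current environment.
    """
    if names is None:
        names = [dist.metadata["Name"] for dist in metadata.distributions()]
    # Skip entries with missing names, then keep mxnet, mxnet-cu80, and so on.
    return sorted({n.lower() for n in names if n and n.lower().startswith("mxnet")})


print(installed_mxnet_variants())
```

After the two pip commands above, one would expect only the plain mxnet distribution to be listed; keras-mxnet is not matched because its name does not start with mxnet.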
Now, running the code in the Jupyter notebook gives me:
60000 train samples
10000 test samples
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_4 (Dense) (None, 512) 401920
_________________________________________________________________
activation_4 (Activation) (None, 512) 0
_________________________________________________________________
dropout_3 (Dropout) (None, 512) 0
_________________________________________________________________
dense_5 (Dense) (None, 512) 262656
_________________________________________________________________
activation_5 (Activation) (None, 512) 0
_________________________________________________________________
dropout_4 (Dropout) (None, 512) 0
_________________________________________________________________
dense_6 (Dense) (None, 10) 5130
_________________________________________________________________
activation_6 (Activation) (None, 10) 0
=================================================================
Total params: 669,706
Trainable params: 669,706
Non-trainable params: 0
_________________________________________________________________
Train on 60000 samples, validate on 10000 samples
Epoch 1/20
---------------------------------------------------------------------------
MXNetError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/mxnet/symbol/symbol.py in simple_bind(self, ctx, grad_req, type_dict, stype_dict, group2ctx, shared_arg_names, shared_exec, shared_buffer, **kwargs)
1512 shared_exec_handle,
-> 1513 ctypes.byref(exe_handle)))
1514 except MXNetError as e:
/usr/local/lib/python3.6/dist-packages/mxnet/base.py in check_call(ret)
148 if ret != 0:
--> 149 raise MXNetError(py_str(_LIB.MXGetLastError()))
150
MXNetError: [04:19:54] src/storage/storage.cc:123: Compile with USE_CUDA=1 to enable GPU usage
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x1c05f2) [0x7f737ac845f2]
[bt] (1) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x1c0bd8) [0x7f737ac84bd8]
[bt] (2) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x2d7d3cd) [0x7f737d8413cd]
[bt] (3) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x2d8141d) [0x7f737d84541d]
[bt] (4) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x2d83206) [0x7f737d847206]
[bt] (5) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27a2831) [0x7f737d266831]
[bt] (6) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27a2984) [0x7f737d266984]
[bt] (7) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27aecec) [0x7f737d272cec]
[bt] (8) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27b55f8) [0x7f737d2795f8]
[bt] (9) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27c163a) [0x7f737d28563a]
During handling of the above exception, another exception occurred:
RuntimeError Traceback (most recent call last)
<ipython-input-4-c71d8965f0f3> in <module>()
49 history = model.fit(X_train, Y_train,
50 batch_size=batch_size, epochs=nb_epoch,
---> 51 verbose=1, validation_data=(X_test, Y_test))
52 score = model.evaluate(X_test, Y_test, verbose=0)
53 print('Test score:', score[0])
/usr/local/lib/python3.6/dist-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
1042 initial_epoch=initial_epoch,
1043 steps_per_epoch=steps_per_epoch,
-> 1044 validation_steps=validation_steps)
1045
1046 def evaluate(self, x=None, y=None,
/usr/local/lib/python3.6/dist-packages/keras/engine/training_arrays.py in fit_loop(model, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps)
197 ins_batch[i] = ins_batch[i].toarray()
198
--> 199 outs = f(ins_batch)
200 if not isinstance(outs, list):
201 outs = [outs]
/usr/local/lib/python3.6/dist-packages/keras/backend/mxnet_backend.py in train_function(inputs)
4794 def train_function(inputs):
4795 self._check_trainable_weights_consistency()
-> 4796 data, label, _, data_shapes, label_shapes = self._adjust_module(inputs, 'train')
4797
4798 batch = mx.io.DataBatch(data=data, label=label, bucket_key='train',
/usr/local/lib/python3.6/dist-packages/keras/backend/mxnet_backend.py in _adjust_module(self, inputs, phase)
4746 self._set_weights()
4747 else:
-> 4748 self._module.bind(data_shapes=data_shapes, label_shapes=None, for_training=True)
4749 self._set_weights()
4750 self._module.init_optimizer(kvstore=self._kvstore, optimizer=self.optimizer)
/usr/local/lib/python3.6/dist-packages/mxnet/module/bucketing_module.py in bind(self, data_shapes, label_shapes, for_training, inputs_need_grad, force_rebind, shared_module, grad_req)
341 compression_params=self._compression_params)
342 module.bind(data_shapes, label_shapes, for_training, inputs_need_grad,
--> 343 force_rebind=False, shared_module=None, grad_req=grad_req)
344 self._curr_module = module
345 self._curr_bucket_key = self._default_bucket_key
/usr/local/lib/python3.6/dist-packages/mxnet/module/module.py in bind(self, data_shapes, label_shapes, for_training, inputs_need_grad, force_rebind, shared_module, grad_req)
428 fixed_param_names=self._fixed_param_names,
429 grad_req=grad_req, group2ctxs=self._group2ctxs,
--> 430 state_names=self._state_names)
431 self._total_exec_bytes = self._exec_group._total_exec_bytes
432 if shared_module is not None:
/usr/local/lib/python3.6/dist-packages/mxnet/module/executor_group.py in __init__(self, symbol, contexts, workload, data_shapes, label_shapes, param_names, for_training, inputs_need_grad, shared_group, logger, fixed_param_names, grad_req, state_names, group2ctxs)
263 self.num_outputs = len(self.symbol.list_outputs())
264
--> 265 self.bind_exec(data_shapes, label_shapes, shared_group)
266
267 def decide_slices(self, data_shapes):
/usr/local/lib/python3.6/dist-packages/mxnet/module/executor_group.py in bind_exec(self, data_shapes, label_shapes, shared_group, reshape)
359 else:
360 self.execs.append(self._bind_ith_exec(i, data_shapes_i, label_shapes_i,
--> 361 shared_group))
362
363 self.data_shapes = data_shapes
/usr/local/lib/python3.6/dist-packages/mxnet/module/executor_group.py in _bind_ith_exec(self, i, data_shapes, label_shapes, shared_group)
637 type_dict=input_types, shared_arg_names=self.param_names,
638 shared_exec=shared_exec, group2ctx=group2ctx,
--> 639 shared_buffer=shared_data_arrays, **input_shapes)
640 self._total_exec_bytes += int(executor.debug_str().split('\n')[-3].split()[1])
641 return executor
/usr/local/lib/python3.6/dist-packages/mxnet/symbol/symbol.py in simple_bind(self, ctx, grad_req, type_dict, stype_dict, group2ctx, shared_arg_names, shared_exec, shared_buffer, **kwargs)
1517 error_msg += "%s: %s\n" % (k, v)
1518 error_msg += "%s" % e
-> 1519 raise RuntimeError(error_msg)
1520
1521 # update shared_buffer
RuntimeError: simple_bind error. Arguments:
/dense_4_input1: (128, 784)
[04:19:54] src/storage/storage.cc:123: Compile with USE_CUDA=1 to enable GPU usage
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x1c05f2) [0x7f737ac845f2]
[bt] (1) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x1c0bd8) [0x7f737ac84bd8]
[bt] (2) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x2d7d3cd) [0x7f737d8413cd]
[bt] (3) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x2d8141d) [0x7f737d84541d]
[bt] (4) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x2d83206) [0x7f737d847206]
[bt] (5) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27a2831) [0x7f737d266831]
[bt] (6) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27a2984) [0x7f737d266984]
[bt] (7) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27aecec) [0x7f737d272cec]
[bt] (8) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27b55f8) [0x7f737d2795f8]
[bt] (9) /usr/local/lib/python3.6/dist-packages/mxnet/libmxnet.so(+0x27c163a) [0x7f737d28563a]