Python: ResourceExhaustedError (OOM) on GPU when training a deep learning model with TensorFlow (128 GB RAM)

Tags: python, tensorflow, deep-learning, nlp, recurrent-neural-network

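C:\Users\CVL Acoustics\Documents\bangla-sentence-correction-master>python train.py
Sit back and relax, it will take some time to train the model...
Vocabulary size 250000
WARNING:tensorflow:From C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\ops\rnn.py:417: calling reverse_sequence (from tensorflow.python.ops.array_ops) with seq_dim is deprecated and will be removed in a future version.
Instructions for updating:
seq_dim is deprecated, use seq_axis instead
WARNING:tensorflow:From C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\util\deprecation.py:432: calling reverse_sequence (from tensorflow.python.ops.array_ops) with batch_dim is deprecated and will be removed in a future version.
Instructions for updating:
batch_dim is deprecated, use batch_axis instead
WARNING:tensorflow:From train.py:228: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow
into the labels input on backprop by default.

See {tf.nn.softmax_cross_entropy_with_logits_v2}.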

Epoch 1
Training
Traceback (most recent call last):
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1322, in _do_call
    return fn(*args)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1307, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1409, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[6656,250000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Reshape, Variable_1/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: rnn/while/cond/Add/_87 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_421_rnn/while/cond/Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_clooprnn/while/cond/ArgMax/dimension/_1)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 321, in <module>
    _, l = sess.run([train_op, loss], fd)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 900, in run
    run_metadata_ptr)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1135, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1316, in _do_run
    run_metadata)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\client\session.py", line 1335, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[6656,250000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Reshape, Variable_1/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: rnn/while/cond/Add/_87 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_421_rnn/while/cond/Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_clooprnn/while/cond/ArgMax/dimension/_1)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

Caused by op 'MatMul', defined at:
  File "train.py", line 218, in <module>
    decoder_logits_flat = tf.add(tf.matmul(decoder_outputs_flat, W), b)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\ops\math_ops.py", line 2014, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 4278, in mat_mul
    name=name)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 3414, in create_op
    op_def=op_def)
  File "C:\Users\CVL Acoustics\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py", line 1740, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[6656,250000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
     [[Node: MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Reshape, Variable_1/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

     [[Node: rnn/while/cond/Add/_87 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_421_rnn/while/cond/Add", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](^_clooprnn/while/cond/ArgMax/dimension/_1)]]

Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
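
For scale, the failing allocation can be sized directly from the shape in the message. This is only a back-of-the-envelope sketch: it assumes "type float" means 4-byte float32 and that the tensor is dense. The helper name tensor_bytes is made up for illustration and is not part of TensorFlow. Note that the allocator named in the error is the GPU one (GPU_0_bfc), so the relevant limit is the GPU's own memory, not the 128 GB of system RAM.

```python
from functools import reduce
from operator import mul

BYTES_PER_FLOAT32 = 4  # assumption: "type float" in the error is float32


def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the size in bytes of a dense tensor with the given shape."""
    return reduce(mul, shape, 1) * bytes_per_element


# Shape taken from the OOM message: rows x vocabulary size.
logits_shape = (6656, 250000)
size = tensor_bytes(logits_shape)
print(f"{size} bytes = {size / 2**30:.2f} GiB")
# prints "6656000000 bytes = 6.20 GiB"
```

That is roughly 6.2 GiB for this one output tensor alone. Training typically also needs a gradient of the same shape, plus the weights and other activations, so the real requirement is likely higher.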

This can happen for several reasons:

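  • Try reducing the number of parameters in the network
  • Try reducing the batch size
  • Check whether another kernel (for example, a notebook or another Python process) is currently active and allocating memory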
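
To see how the first two suggestions affect the tensor from the error, here is a rough sketch of the arithmetic. It rests on assumptions that are not stated in the question: that the 6656 rows are batch_size × time_steps (128 × 52 is one combination that gives 6656), that elements are 4-byte floats, and that about 2 GiB can be spared for the logits. The function names and the 2 GiB budget are hypothetical.

```python
def logits_bytes(batch_size, time_steps, vocab_size, bytes_per_element=4):
    """Size in bytes of a dense [batch_size * time_steps, vocab_size] tensor."""
    return batch_size * time_steps * vocab_size * bytes_per_element


def max_batch_size(budget_bytes, time_steps, vocab_size, bytes_per_element=4):
    """Largest batch size whose logits tensor fits within budget_bytes."""
    per_example = time_steps * vocab_size * bytes_per_element
    return budget_bytes // per_example


# Hypothetical split of the 6656 rows: 128 sequences of 52 time steps.
print(logits_bytes(128, 52, 250000))          # 6656000000, as in the error

budget = 2 * 2**30                            # assumed 2 GiB budget for logits
print(max_batch_size(budget, 52, 250000))     # 41 with the full vocabulary
print(max_batch_size(budget, 52, 50000))      # 206 with a 50000-word vocabulary
```

The size grows linearly in both batch size and vocabulary size, so shrinking either one (or both) reduces the allocation proportionally.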

It sounds like you are running out of memory.
Can you help me solve this problem? @JammyD