TensorFlow Universal Sentence Encoder embedding runtime error: allocator cpu ran out of resources

Given the following code:
import tensorflow_hub as hub
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

with tf.Session() as sess:
    embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    sess.run(embed(["test message"]))
I ran this successfully on my local Ubuntu machine (i5 CPU, 3 GB RAM), but when I try to run it on a CentOS VPS with 4 GB RAM and these uname details: Linux 2.6.32-042stab141.3 MSK 2019 x86_64, I get the following error:
2020-06-19 14:15:10.612770: W tensorflow/core/framework/op_kernel.cc:1753] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Resource exhausted: OOM when allocating tensor with shape[26667,320] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
Traceback (most recent call last):
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1365, in _do_call
return fn(*args)
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1349, in _run_fn
return self._call_tf_sessionrun(options, feed_dict, fetch_list,
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1441, in _call_tf_sessionrun
return tf_session.TF_SessionRun_wrapper(self._session, options, feed_dict,
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[26667,320] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu [[{{node RestoreV2}}]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "testEncoder.py", line 12, in <module>
get_features(["one message"])
File "testEncoder.py", line 5, in get_features
sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 957, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1180, in _run
results = self._do_run(handle, final_targets, final_fetches,
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1358, in _do_run
return self._do_call(_run_fn, feeds, fetches, targets, options,
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1384, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[26667,320] and type float on /job:localhost/replica:0/task:0/device:CPU:0 by allocator cpu
[[node RestoreV2 (defined at /home/userM/.local/lib/python3.8/site-packages/tensorflow_hub/module_v2.py:102) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
Original stack trace for 'RestoreV2':
File "testEncoder.py", line 10, in <module>
embed=hub.load("universal_sentence_encoder")
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow_hub/module_v2.py", line 102, in load
obj = tf_v1.saved_model.load_v2(module_path, tags=tags)
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 578, in load
return load_internal(export_dir, tags)
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 602, in load_internal
loader = loader_cls(object_graph_proto,
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 124, in __init__
self._restore_checkpoint()
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/saved_model/load.py", line 310, in _restore_checkpoint
load_status = saver.restore(variables_path)
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 1303, in restore
base.CheckpointPosition(
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 209, in restore
restore_ops = trackable._restore_from_checkpoint_position(self) # pylint: disable=protected-access
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/base.py", line 906, in _restore_from_checkpoint_position
current_position.checkpoint.restore_saveables(
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/training/tracking/util.py", line 288, in restore_saveables
new_restore_ops = functional_saver.MultiDeviceSaver(
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 281, in restore
restore_ops.update(saver.restore(file_prefix))
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/training/saving/functional_saver.py", line 95, in restore
restored_tensors = io_ops.restore_v2(
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1503, in restore_v2
_, _, _op, _outputs = _op_def_library._apply_op_helper(
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/framework/op_def_library.py", line 742, in _apply_op_helper
op = g._create_op_internal(op_type_name, inputs, dtypes=None,
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 3319, in _create_op_internal
ret = Operation(
File "/home/userM/.local/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 1791, in __init__
self._traceback = tf_stack.extract_stack()
I read that clearing the session after training a model with tf.keras.backend.clear_session() can be a fix, but that doesn't apply here, since I'm not training a model.
Do you know what is causing this error, and how to avoid it?