Python TensorFlow error: "Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)"


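I am trying to run a model locally on my laptop with a different dataset. Unfortunately, I am getting the following error:
Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)
(I think this is the main error, but I could be wrong.)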

What is unusual is that the error only appears after the model has already trained for several epochs.

Here is the whole log. (I have trimmed the upper part, which shows TensorFlow initializing and contains no errors or warnings.)

Train for 7290 steps
Epoch 1/15
2020-05-28 22:57:18.046206: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
7290/7290 [==============================] - 1986s 272ms/step - loss: 2.0052 - accuracy: 0.0939
Epoch 2/15
7290/7290 [==============================] - 1971s 270ms/step - loss: 1.6234 - accuracy: 0.1223
Epoch 3/15
7290/7290 [==============================] - 1968s 270ms/step - loss: 1.5535 - accuracy: 0.1291
Epoch 4/15
7290/7290 [==============================] - 1968s 270ms/step - loss: 1.5192 - accuracy: 0.1325
Epoch 5/15
7290/7290 [==============================] - 1968s 270ms/step - loss: 1.4978 - accuracy: 0.1348
Epoch 6/15
7290/7290 [==============================] - 1967s 270ms/step - loss: 1.4825 - accuracy: 0.1364
Epoch 7/15
7290/7290 [==============================] - 1967s 270ms/step - loss: 1.4711 - accuracy: 0.1376
Epoch 8/15
7290/7290 [==============================] - 1966s 270ms/step - loss: 1.4621 - accuracy: 0.1386
Epoch 9/15
 174/7290 [..............................] - ETA: 32:11 - loss: 1.4382 - accuracy: 0.13312020-05-29 03:20:43.528885: W tensorflow/core/framework/op_kernel.cc:1655] OP_REQUIRES failed at resource_variable_ops.cc:540 : Not found: Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)
2020-05-29 03:20:43.528953: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Not found: Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)
     [[{{node Adam/Adam/update/AssignSubVariableOp}}]]
     [[GroupCrossDeviceControlEdges_0/Adam/Adam/Const/_301]]
2020-05-29 03:20:43.529025: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Not found: Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)
     [[{{node Adam/Adam/update/AssignSubVariableOp}}]]
 175/7290 [..............................] - ETA: 32:14 - loss: 1.4382 - accuracy: 0.1331Traceback (most recent call last):
  File "model.py", line 114, in <module>
    model.fit(dataset, epochs=EPOCHS)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 819, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 342, in fit
    total_epochs=epochs)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 568, in __call__
    result = self._call(*args, **kwds)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/def_function.py", line 599, in _call
    return self._stateless_fn(*args, **kwds)  # pylint: disable=not-callable
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 2363, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1611, in _filtered_call
    self.captured_inputs)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 1692, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/function.py", line 545, in call
    ctx=ctx)
  File "/home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.NotFoundError: 2 root error(s) found.
  (0) Not found:  Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)
     [[node Adam/Adam/update/AssignSubVariableOp (defined at model.py:114) ]]
  (1) Not found:  Container localhost does not exist. (Could not find resource: localhost/_AnonymousVar0)
     [[node Adam/Adam/update/AssignSubVariableOp (defined at model.py:114) ]]
     [[GroupCrossDeviceControlEdges_0/Adam/Adam/Const/_301]]
0 successful operations.
0 derived errors ignored. [Op:__inference_distributed_function_15977]

Errors may have originated from an input operation.
Input Source operations connected to node Adam/Adam/update/AssignSubVariableOp:
 transformer/encoder/embedding/embedding_lookup/11773 (defined at /home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/contextlib.py:112)

Input Source operations connected to node Adam/Adam/update/AssignSubVariableOp:
 transformer/encoder/embedding/embedding_lookup/11773 (defined at /home/atulu/anaconda3/envs/tf2-gpu/lib/python3.7/contextlib.py:112)

Function call stack:
distributed_function -> distributed_function