Finding the TensorFlow variable that causes 'FailedPreconditionError: Error while reading resource variable'
I implemented the basic REINFORCE algorithm from scratch using Keras, following a guide. I am experimenting with a multi-agent setup, training two agents (a sender and a receiver) simultaneously. The first call to train_on_batch runs without any problem for both agents. However, when train_on_batch is called a second time, it crashes with the following error:
FailedPreconditionError: Error while reading resource variable _AnonymousVar18 from Container:
localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar18/class
tensorflow::Var does not exist.
[[node mul_3/ReadVariableOp (defined at D:\ProgramData\Anaconda3\envs\jupyter-mess\lib\site-packages\keras\backend\tensorflow_backend.py:3014) ]]
[Op:__inference_keras_scratch_graph_1235]
Function call stack:
keras_scratch_graph
Here is the sender model, a simple feed-forward network with a custom loss. The receiver network is very similar:
class Sender:
    def __init__(self, n_images, input_image_shape, embedding_size, vocabulary_size, temperature):
        self.reset_memory()
        image_inputs = [layers.Input(shape=input_image_shape, dtype="float32")
                        for i in range(n_images)]
        image_embedding_layer = layers.Dense(embedding_size)
        sigmoid = layers.Activation("sigmoid")
        output_layer = layers.Dense(vocabulary_size)
        temperature_layer = layers.Lambda(lambda x: x / temperature)
        softmax = layers.Softmax()

        y = [image_embedding_layer(x) for x in image_inputs]
        y = [sigmoid(x) for x in y]
        y = layers.concatenate(y, axis=-1)
        y = output_layer(y)
        y = temperature_layer(y)
        y = softmax(y)
        self.model = Model(image_inputs, y)

        index = layers.Input(shape=[1], dtype="int32")
        y_selected = layers.Lambda(
            lambda probs_index: tf.gather(*probs_index, axis=-1),
        )([y, index])

        def loss(target, prediction):
            return - K.log(prediction) * target

        self.model_train = Model([*image_inputs, index], y_selected)
        self.model_train.compile(loss=loss, optimizer=OPTIMIZER)

    def predict(self, state):
        return self.model.predict_on_batch(x=state)

    def update(self, state, action, target):
        x = [*state, action]
        return self.model_train.train_on_batch(x=x, y=target)
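For reference, the custom loss above is the standard REINFORCE policy-gradient term, -log π(a|s) · r, applied to the probability of the sampled action. A minimal NumPy sketch of the same computation (the probability and reward values are made up for illustration, not taken from my run):

```python
import numpy as np

def reinforce_loss(target, prediction):
    # Mirrors the Keras loss above: -log(pi(a|s)) * reward
    return -np.log(prediction) * target

prob_of_action = np.array([0.25])  # hypothetical pi(a|s) for the sampled action
reward = np.array([1.0])           # hypothetical reward used as the target
print(reinforce_loss(reward, prob_of_action))  # -> [1.38629436]
```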
Here is the training loop:
for episode in range(1, N_EPISODES + 1):
    game.reset()

    sender_state = game.get_sender_state(n_images=N_IMAGES, unique_categories=True, expand=True)
    sender_probs = sender.predict(state=sender_state)
    sender_probs = np.squeeze(sender_probs)
    sender_action = np.random.choice(np.arange(len(sender_probs)), p=sender_probs)

    receiver_state = game.get_receiver_state(sender_action, expand=True)
    receiver_probs = receiver.predict(receiver_state)
    receiver_probs = np.squeeze(receiver_probs)
    receiver_action = np.random.choice(np.arange(len(receiver_probs)), p=receiver_probs)

    sender_reward, receiver_reward, success = game.evaluate_guess(receiver_action)
    sender.update(sender_state, np.asarray([sender_action]), np.asarray([sender_reward]))
    receiver.update(receiver_state, np.asarray([receiver_action]), np.asarray([receiver_reward]))
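The loop samples each action from the network's squeezed softmax output with np.random.choice; stripped of the game logic, the sampling step alone looks like this (the probabilities are made up for illustration):

```python
import numpy as np

np.random.seed(0)
probs = np.array([0.1, 0.7, 0.2])  # hypothetical softmax output after np.squeeze
actions = [np.random.choice(np.arange(len(probs)), p=probs) for _ in range(10000)]
freqs = np.bincount(actions, minlength=3) / len(actions)
print(freqs)  # empirical frequencies approach [0.1, 0.7, 0.2]
```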
The implementation crashes when I run it locally in a Jupyter notebook with Keras 2.3.1, but it runs fine on Google Colab with Keras 2.4.3.

Also, when I disable either one of the agents, the program runs without errors for the remaining one, but I cannot train anything that way.

Is there a way to determine which variable is causing the failure? Do I need to run a separate session for each agent, or something along those lines?