Python TensorFlow 2.0 tf.keras API: eager mode vs. graph mode

python tensorflow

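In TensorFlow, as far as I can tell, your critic_output is just a TensorFlow tensor, so you can apply the tf.math.reduce_mean operation to it. This works inside a TensorFlow session rather than imperatively; that is, it returns an operation that is then evaluated in a TensorFlow session: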

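# TensorFlow 1.x graph-mode API (tf.placeholder and tf.Session)
import tensorflow as tf
import numpy as np

inp = tf.placeholder(dtype=tf.float32)
mean_op = tf.math.reduce_mean(inp)  # builds a graph op; nothing is computed yet
with tf.Session() as sess:
    # The op is only evaluated when it is run with concrete input
    print(sess.run(mean_op, feed_dict={inp: np.ones(10)}))
    print(sess.run(mean_op, feed_dict={inp: np.random.randn(10)}))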

It evaluates to something like the following (the second value depends on the random input):

1.0
-0.002577734
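
For comparison, here is a minimal sketch of the same mean computed without a session. It assumes TensorFlow 2.x, where eager execution is enabled by default; the names values and graph_mean are made up for illustration and do not come from the question.

```python
import numpy as np
import tensorflow as tf  # assumption: TensorFlow 2.x, eager execution on by default

values = tf.constant(np.ones(10), dtype=tf.float32)

# Eager mode: the op runs immediately and returns a concrete tensor.
mean = tf.math.reduce_mean(values)
print(float(mean))

# Graph mode in 2.x: wrap the computation in tf.function instead of
# building placeholders and a session by hand.
@tf.function
def graph_mean(x):
    return tf.math.reduce_mean(x)

print(float(graph_mean(values)))
```

Both calls compute the same value; the difference is only whether the op is executed right away or traced into a graph first.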

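So, first of all, your error comes from the fact that optimizer.get_updates() is designed for graph mode: it calls K.gradients() to obtain the gradient tensors, and the resulting Keras optimizer-based updates are then applied to the model's trainable variables through K.function. Second, as far as eager versus non-eager mode is concerned, the cost function loss = -tf.keras.backend.mean(critic_output) has no flaws. You should get rid of the graph-mode code and stick to native 2.0 mode. Based on your code, the training step should look like this: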

def train_method(self, state_input):
    # Record the forward pass so gradients can be computed eagerly
    with tf.GradientTape() as tape:
        critic_output = self.critic([self.actor(state_input), state_input])
        loss = -tf.keras.backend.mean(critic_output)
    # tape.gradient takes the sources as its second positional argument
    grads = tape.gradient(loss, self.actor.trainable_variables)
    # Note: self.optimizer_actor must provide apply_gradients,
    # so it should be tf.train.OptimizerName...
    # (tf.keras.optimizers classes in TF 2.x provide it as well)
    self.optimizer_actor.apply_gradients(zip(grads, self.actor.trainable_variables))
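
To try the training step above in isolation, here is a self-contained sketch. Everything in it is an assumption made for illustration: the toy actor and critic networks, the sizes STATE_DIM and ACTION_DIM, the Adam optimizer and the function name train_actor are not taken from the question. It assumes TensorFlow 2.x.

```python
import numpy as np
import tensorflow as tf  # assumption: TensorFlow 2.x

STATE_DIM, ACTION_DIM = 3, 2  # hypothetical sizes

# Hypothetical toy actor: state -> action
actor_in = tf.keras.Input(shape=(STATE_DIM,))
actor_hidden = tf.keras.layers.Dense(8, activation="relu")(actor_in)
actor_out = tf.keras.layers.Dense(ACTION_DIM, activation="tanh")(actor_hidden)
actor = tf.keras.Model(actor_in, actor_out)

# Hypothetical toy critic: [action, state] -> scalar value
action_in = tf.keras.Input(shape=(ACTION_DIM,))
state_in = tf.keras.Input(shape=(STATE_DIM,))
joined = tf.keras.layers.Concatenate()([action_in, state_in])
critic_hidden = tf.keras.layers.Dense(8, activation="relu")(joined)
critic = tf.keras.Model([action_in, state_in], tf.keras.layers.Dense(1)(critic_hidden))

optimizer_actor = tf.keras.optimizers.Adam(learning_rate=1e-3)

def train_actor(state_input):
    """One actor update: maximize the critic's value of the actor's actions."""
    with tf.GradientTape() as tape:
        critic_output = critic([actor(state_input), state_input])
        loss = -tf.reduce_mean(critic_output)
    # Differentiate only with respect to the actor; the critic stays fixed.
    grads = tape.gradient(loss, actor.trainable_variables)
    optimizer_actor.apply_gradients(zip(grads, actor.trainable_variables))
    return loss, grads

states = tf.constant(np.random.randn(4, STATE_DIM), dtype=tf.float32)
before = [v.numpy().copy() for v in actor.trainable_variables]
loss, grads = train_actor(states)
print("actor loss:", float(loss))
```

Only the actor's variables are passed to tape.gradient and apply_gradients, so the critic's weights are left untouched by this step even though the loss flows through the critic.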

This should be a GitHub issue. Because of the way tf.keras works, we need to remove this assertion message from tf.gradients. Could you file a GitHub issue and cc @alextp?
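
As a small illustration of why graph-mode gradient code trips up under eager execution, the sketch below calls tf.gradients directly and then computes the same derivative with tf.GradientTape. It assumes TensorFlow 2.x with eager execution left on; the variable names are made up for the example, and the exact wording of the error message may differ between versions.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x, eager execution on by default

x = tf.Variable(3.0)

# Graph-style gradient call: not supported while executing eagerly.
raised = False
try:
    tf.gradients(x * x, x)
except RuntimeError as err:
    raised = True
    print("tf.gradients failed:", err)

# Eager-friendly equivalent: record the computation on a tape.
with tf.GradientTape() as tape:
    y = x * x  # trainable variables are watched automatically
grad = tape.gradient(y, x)
print(float(grad))
```

K.gradients() in tf.keras goes through the same graph-mode machinery, which is why code built on optimizer.get_updates() needs either graph mode or a rewrite around the tape.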