TensorFlow: computing the Hessian with tf.GradientTape

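Thanks for taking a look at this question.

I want to compute the Hessian matrix of a tensorflow.keras.Model.

For the higher-order derivatives, I tried nesting GradientTape.

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

# Example graph and inputs
xs = tf.constant(tf.random.normal([100,24]))

ex_model = Sequential()
ex_model.add(Input(shape=(24,)))
ex_model.add(Dense(10))
ex_model.add(Dense(1))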

with tf.GradientTape(persistent=True) as tape:
    tape.watch(xs)
    ys = ex_model(xs)
g = tape.gradient(ys, xs)
h = tape.jacobian(g, xs)
print(g.shape)
print(h.shape)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-20-dbf443f1ddab> in <module>
      5 h = tape.jacobian(g, xs)
      6 print(g.shape)
----> 7 print(h.shape)

AttributeError: 'NoneType' object has no attribute 'shape'

I also made a second attempt:

with tf.GradientTape() as tape1:
    with tf.GradientTape() as tape2:
        tape2.watch(xs)
        ys = ex_model(xs)
    g = tape2.gradient(ys, xs)
h = tape1.jacobian(g, xs)
    
print(g.shape)
print(h.shape)


(100, 24)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-17-c5bbb17404bc> in <module>
      7 
      8 print(g.shape)
----> 9 print(h.shape)

AttributeError: 'NoneType' object has no attribute 'shape'

Why can't I compute the gradient of g with respect to xs?

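You are computing the second-order gradient with respect to xs, and it is zero. Both Dense layers are linear, so the first gradient g does not depend on xs and is a constant with respect to it. TensorFlow reports the gradient of a constant as None rather than as a tensor of zeros, which is why tape1.jacobian(g, xs) returns None.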

When the first gradient still depends on x, the second-order gradient is well defined:

import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
  with tf.GradientTape() as t1:
    y = w * x**3
  dy_dx = t1.gradient(y, x)
d2y_dx2 = t2.gradient(dy_dx, x)

print('dy_dx:', dy_dx) # 3 * 3 * x**2 => 9.0
print('d2y_dx2:', d2y_dx2) # 9 * 2 * x => 18.0

Output:

dy_dx: tf.Tensor(9.0, shape=(), dtype=float32)
d2y_dx2: tf.Tensor(18.0, shape=(), dtype=float32)

When the first gradient is a constant with respect to x, the second-order gradient is None:

import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
  with tf.GradientTape() as t1:
    y = w * x
  dy_dx = t1.gradient(y, x) # equals w, independent of x
d2y_dx2 = t2.gradient(dy_dx, x)

print('dy_dx:', dy_dx)
print('d2y_dx2:', d2y_dx2)

Output:

dy_dx: tf.Tensor(3.0, shape=(), dtype=float32)
d2y_dx2: None
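
If a tensor of zeros is more convenient than None, the `unconnected_gradients` argument of `tape.gradient` can request it. The following is a minimal sketch that reuses the `y = w * x` case from this answer; it only changes how the missing gradient is reported, not the underlying math.

```python
import tensorflow as tf

x = tf.Variable(1.0)
w = tf.constant(3.0)
with tf.GradientTape() as t2:
    with tf.GradientTape() as t1:
        y = w * x
    dy_dx = t1.gradient(y, x)  # equals w, so it does not depend on x

# Ask for zeros instead of None when the target is not connected to the source.
d2y_dx2 = t2.gradient(
    dy_dx, x, unconnected_gradients=tf.UnconnectedGradients.ZERO
)

print('dy_dx:', dy_dx)
print('d2y_dx2:', d2y_dx2)
```

The same argument is accepted by `tape.jacobian`, so the Hessian of a linear model could be returned as an explicit zero tensor in the same way.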

However, you can compute a second-order gradient of the layer parameters with respect to xs, for example:
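
One way to read this is as an input-gradient penalty: take the gradient of the output with respect to xs, reduce it to a scalar, and differentiate that scalar with respect to the layer weights. The sketch below assumes this reading; the name `penalty` and the choice of a squared sum are illustrative and do not come from the original answer.

```python
import tensorflow as tf

xs = tf.random.normal([100, 24])
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])
model(xs)  # call once so the layers create their weights

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        inner.watch(xs)
        ys = model(xs)
    # Computed inside `outer`, so the gradient ops themselves are recorded.
    g = inner.gradient(ys, xs)
    # Hypothetical scalar built from the input gradient.
    penalty = tf.reduce_sum(tf.square(g))

# Second-order quantity: derivative of an input gradient w.r.t. the kernels.
kernels = [layer.kernel for layer in model.layers]
grads = outer.gradient(penalty, kernels)

for grad in grads:
    print(grad.shape)
```

Even though the model is linear in xs, the input gradient depends on both kernel matrices, so these gradients are tensors rather than None.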

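Thank you for your reply. After I added a Lambda layer computing x**2 to the model, the nested tapes return a tensor. Thank you!

@MyPrunus Glad I could help. If my answer solved your problem, please mark it as accepted so that others with a similar problem know where to find the solution. Have a nice day :)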
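
The fix described in the first comment can be sketched roughly as follows. The exact model the commenter used is not shown, so the position of the squaring layer is an assumption. The sketch also watches xs on the outer tape and uses `batch_jacobian` to get one Hessian per sample; both are choices made here, not details confirmed by the thread.

```python
import tensorflow as tf

xs = tf.random.normal([100, 24])
model = tf.keras.Sequential([
    # Nonlinearity (x**2) so that the first gradient depends on xs.
    tf.keras.layers.Lambda(lambda x: tf.square(x)),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1),
])

with tf.GradientTape() as tape1:
    tape1.watch(xs)  # the outer tape must track xs as well
    with tf.GradientTape() as tape2:
        tape2.watch(xs)
        ys = model(xs)
    # Taken inside tape1 so the gradient computation is recorded.
    g = tape2.gradient(ys, xs)

# One 24 x 24 Hessian per sample, without cross-sample blocks.
h = tape1.batch_jacobian(g, xs)

print(g.shape)
print(h.shape)
```

`tape1.jacobian(g, xs)` would instead return a tensor of shape (100, 24, 100, 24) that also contains the cross-sample blocks, which are zero here because each output row depends only on its own input row.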