
Python: How to compute all second derivatives in TensorFlow (only the diagonal of the Hessian matrix)?


I have a loss value/function and I would like to compute all of its second derivatives with respect to a tensor f (of size n). I managed to use tf.gradients twice, but when applying it the second time, it sums the derivatives across the first input (see second_derivatives in the code below).

I also managed to retrieve the Hessian matrix, but I would like to compute only its diagonal to avoid the extra computation.

import tensorflow as tf
import numpy as np

f = tf.Variable(np.array([[1., 2., 0]]).T)
loss = tf.reduce_prod(f ** 2 - 3 * f + 1)

first_derivatives = tf.gradients(loss, f)[0]

second_derivatives = tf.gradients(first_derivatives, f)[0]

hessian = [tf.gradients(first_derivatives[i,0], f)[0][:,0] for i in range(3)]

model = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(model)
    print("\nloss\n", sess.run(loss))
    print("\nloss'\n", sess.run(first_derivatives))
    print("\nloss''\n", sess.run(second_derivatives))
    hessian_value = np.array(sess.run(hessian))
    print("\nHessian\n", hessian_value)
My idea was that tf.gradients(first_derivatives, f[0, 0])[0] could be used to retrieve, for instance, the second derivative with respect to f_0, but TensorFlow does not seem to allow differentiating with respect to a slice of a tensor.
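As a TensorFlow-free sanity check for what the diagonal should be, the second derivatives of this particular loss can be approximated with central finite differences in NumPy (a sketch; the helper name and step size h are arbitrary choices, not part of the original code):

```python
import numpy as np

def loss_fn(f):
    # Same loss as above: product of (f_i^2 - 3 f_i + 1)
    return np.prod(f ** 2 - 3 * f + 1)

def hessian_diag_fd(fn, f, h=1e-3):
    """Central-difference estimate of d^2 fn / df_i^2 for each i."""
    diag = np.empty_like(f)
    for i in range(f.size):
        e = np.zeros_like(f)
        e[i] = h
        diag[i] = (fn(f + e) - 2.0 * fn(f) + fn(f - e)) / h ** 2
    return diag

f = np.array([1., 2., 0.])
print(hessian_diag_fd(loss_fn, f))  # approximately [-2. -2.  2.]
```

This gives a reference value to compare any TensorFlow-based approach against.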

tf.gradients([f1, f2, f3], ...) computes the gradient of f = f1 + f2 + f3. Also, differentiating with respect to x[0] is problematic, because x[0] refers to a new Slice node, which is not an ancestor of your loss, so the derivative with respect to it will be None. You could work around this by using pack to glue x[0], x[1], ... into a tensor xx and having your loss depend on xx instead of x. An alternative is to use separate variables for the individual components, in which case computing the Hessian looks something like this:

def replace_none_with_zero(l):
  return [0 if i is None else i for i in l]

tf.reset_default_graph()

x = tf.Variable(1.)
y = tf.Variable(1.)
loss = tf.square(x) + tf.square(y)
grads = tf.gradients([loss], [x, y])
hess0 = replace_none_with_zero(tf.gradients([grads[0]], [x, y]))
hess1 = replace_none_with_zero(tf.gradients([grads[1]], [x, y]))
hessian = tf.pack([tf.pack(hess0), tf.pack(hess1)])
sess = tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
print(hessian.eval())
You will see:

[[ 2.  0.]
 [ 0.  2.]]
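As an aside, tf.pack was later renamed tf.stack, and in TensorFlow 2.x the same full Hessian can be built eagerly with nested GradientTapes. A sketch, assuming a TF 2.x runtime is available:

```python
import tensorflow as tf

x = tf.Variable(1.)
y = tf.Variable(1.)

with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        loss = tf.square(x) + tf.square(y)
    # First derivatives, computed while the outer tape is still recording
    grads = inner.gradient(loss, [x, y])
    g = tf.stack(grads)               # shape (2,)
# Differentiating the stacked gradient once more yields the Hessian columns
cols = outer.jacobian(g, [x, y])      # two tensors of shape (2,)
hessian = tf.stack(cols, axis=1)      # shape (2, 2)
print(hessian.numpy())                # [[2. 0.] [0. 2.]]
```

Because the Hessian of x² + y² is symmetric, stacking the per-variable Jacobians as columns or rows gives the same matrix.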

Nowadays, consider tf.hessians:

tf.hessians(loss, f)
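Note that tf.hessians still computes the full matrix. In eager TF 2.x, one way to get a full Hessian and then keep only its diagonal (a sketch applied to the question's loss; it still pays for the off-diagonal entries) is:

```python
import tensorflow as tf

f = tf.Variable([1., 2., 0.])
with tf.GradientTape() as outer:
    with tf.GradientTape() as inner:
        loss = tf.reduce_prod(f ** 2 - 3. * f + 1.)
    grad = inner.gradient(loss, f)   # first derivatives, shape (3,)
hessian = outer.jacobian(grad, f)    # full Hessian, shape (3, 3)
diag = tf.linalg.diag_part(hessian)
print(diag.numpy())                  # [-2. -2.  2.]
```

The TF 2.0 function below avoids building the full matrix in the first place.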

The following function computes the second derivatives (the diagonal of the Hessian matrix) in TensorFlow 2.0:

%tensorflow_version 2.x  # Tells Colab to load TF 2.x
import tensorflow as tf

def calc_hessian_diag(f, x):
    """
    Calculates the diagonal entries of the Hessian of the function f
    (which maps rank-1 tensors to scalars) at coordinates x (rank-1
    tensors).
    
    Let k be the number of points in x, and n be the dimensionality of
    each point. For each point k, the function returns

      (d^2f/dx_1^2, d^2f/dx_2^2, ..., d^2f/dx_n^2) .

    Inputs:
      f (function): Takes a shape-(k,n) tensor and outputs a
          shape-(k,) tensor.
      x (tf.Tensor): The points at which to evaluate the Laplacian
          of f. Shape = (k,n).
    
    Outputs:
      A tensor containing the diagonal entries of the Hessian of f at
      points x. Shape = (k,n).
    """
    # Use the unstacking and re-stacking trick, which comes
    # from https://github.com/xuzhiqin1990/laplacian/
    with tf.GradientTape(persistent=True) as g1:
        # Turn x into a list of n tensors of shape (k,)
        x_unstacked = tf.unstack(x, axis=1)
        g1.watch(x_unstacked)

        with tf.GradientTape() as g2:
            # Re-stack x before passing it into f
            x_stacked = tf.stack(x_unstacked, axis=1) # shape = (k,n)
            g2.watch(x_stacked)
            f_x = f(x_stacked) # shape = (k,)
        
        # Calculate gradient of f with respect to x
        df_dx = g2.gradient(f_x, x_stacked) # shape = (k,n)
        # Turn df/dx into a list of n tensors of shape (k,)
        df_dx_unstacked = tf.unstack(df_dx, axis=1)

    # Calculate 2nd derivatives
    d2f_dx2 = []
    for df_dxi,xi in zip(df_dx_unstacked, x_unstacked):
        # Take 2nd derivative of each dimension separately:
        #   d/dx_i (df/dx_i)
        d2f_dx2.append(g1.gradient(df_dxi, xi))
    
    # Stack 2nd derivatives
    d2f_dx2_stacked = tf.stack(d2f_dx2, axis=1) # shape = (k,n)
    
    return d2f_dx2_stacked
Here is an example of its usage, with the function f(x) = ln(r²), where x are 3D coordinates and r is the radius in spherical coordinates:

f = lambda q : tf.math.log(tf.math.reduce_sum(q**2, axis=1))
x = tf.random.uniform((5,3))

d2f_dx2 = calc_hessian_diag(f, x)
print(d2f_dx2)
The output looks something like this:

tf.Tensor(
[[ 1.415968    1.0215727  -0.25363517]
 [-0.67299247  2.4847088   0.70901346]
 [ 1.9416015  -1.1799507   1.3937857 ]
 [ 1.4748447   0.59702784 -0.52290654]
 [ 1.1786096   0.07442689  0.2396735 ]], shape=(5, 3), dtype=float32)
We can check the correctness of the implementation by computing the Laplacian (i.e., summing the diagonal of the Hessian) and comparing it against the theoretical answer for the chosen function, 2/r²:

print(tf.reduce_sum(d2f_dx2, axis=1)) # Laplacian from summing above results
print(2./tf.math.reduce_sum(x**2, axis=1)) # Analytic expression for Laplacian
I get the following:

tf.Tensor([2.1839054 2.5207298 2.1554365 1.5489659 1.49271  ], shape=(5,), dtype=float32)
tf.Tensor([2.1839058 2.5207298 2.1554365 1.5489662 1.4927098], shape=(5,), dtype=float32)

They agree to within rounding error.
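The analytic value 2/r² can also be confirmed without TensorFlow. For f(x) = ln(‖x‖²) in three dimensions, a NumPy finite-difference Laplacian (a sketch; the point and step size are arbitrary choices) gives the same number:

```python
import numpy as np

def f(x):
    return np.log(np.sum(x ** 2))

def laplacian_fd(fn, x, h=1e-4):
    """Sum of central-difference second derivatives over all coordinates."""
    lap = 0.0
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        lap += (fn(x + e) - 2.0 * fn(x) + fn(x - e)) / h ** 2
    return lap

x = np.array([0.3, 0.4, 0.5])     # here r^2 = 0.5
print(laplacian_fd(f, x))          # finite-difference Laplacian
print(2.0 / np.sum(x ** 2))        # analytic value 2 / r^2 = 4.0
```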

Thanks for your answer; unfortunately it doesn't really help, since I only want to retrieve the diagonal of the Hessian. I tried using pack with x[0], x[1], ... but it still gives me an error.
hess0 = tf.gradients([grads[0]], [x]); hess1 = tf.gradients([grads[1]], [y]) will compute only the diagonal terms. I finally got it working with your last reply and tf.pack()! Note that as of TensorFlow 1.0, tf.pack() has been renamed tf.stack().