PyTorch: Why does autograd not produce gradients for intermediate variables?


I am trying to understand how gradients are represented and how autograd works:

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

z.backward()

print(x.grad)
#Variable containing:
#32
#[torch.FloatTensor of size 1]

print(y.grad)
#None

Why does it not produce a gradient for y? If y.grad = dz/dy, shouldn't it at least produce a variable like y.grad = 2*y?

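By default, gradients are retained only for leaf variables. Gradients of non-leaf variables are not retained for later inspection. This is by design, to save memory.

- Soumith Chintala

See: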
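The leaf versus non-leaf distinction in the quote above can be checked directly. The following is a minimal sketch that assumes a recent PyTorch release, where plain tensors carry requires_grad and the Variable wrapper is no longer needed; the is_leaf attribute is used here only for illustration and does not appear in the original thread.

```python
import torch

# x is created directly by the user, so it is a leaf of the graph.
x = torch.tensor([2.0], requires_grad=True)
# y and z are results of operations, so they are non-leaf (intermediate) tensors.
y = x * x
z = y * y

print(x.is_leaf, y.is_leaf, z.is_leaf)

z.backward()

# Only the leaf keeps its gradient: dz/dx = 4 * x**3, which is 32 at x = 2.
print(x.grad)
```

The gradient dz/dy is still computed during the backward pass, since it is needed to obtain dz/dx; it is simply not stored on y afterwards.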

Option 1: Call y.retain_grad() before calling z.backward().

Source:
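A minimal sketch of Option 1, reusing the question's values and assuming a recent PyTorch release (plain tensors instead of the Variable wrapper):

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x

# Ask autograd to keep the gradient of this non-leaf tensor.
# This has to happen before backward() is called.
y.retain_grad()

z = y * y
z.backward()

# dz/dy = 2 * y, and y = 4 here, so the retained gradient is 8.
print(y.grad)
# dz/dx = dz/dy * dy/dx = 8 * 4 = 32, unchanged from the question.
print(x.grad)
```

This matches the expectation in the question that y.grad should equal 2*y, evaluated at y = 4.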

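Option 2: Register a hook, which is basically a function that is called when the gradient is computed. You can then save it, assign it, print it, or do whatever else you need with it.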

from __future__ import print_function
import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y

y.register_hook(print) ## this can be anything you need it to be

z.backward()

Output:

Variable containing:
 8
[torch.FloatTensor of size 1]

Source:
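Instead of printing, the hook can store the gradient for later use. The sketch below assumes a recent PyTorch release; the grads dictionary and the save_grad helper are hypothetical names introduced for illustration and are not part of PyTorch.

```python
import torch

grads = {}  # hypothetical container for captured gradients


def save_grad(name):
    """Build a hook that stores the incoming gradient under `name`."""
    def hook(grad):
        # Returning None leaves the gradient itself unmodified.
        grads[name] = grad.detach().clone()
    return hook


x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

# register_hook returns a handle that can remove the hook again.
handle = y.register_hook(save_grad("y"))

z.backward()
handle.remove()

# dz/dy = 2 * y, which is 8 at y = 4.
print(grads["y"])
```

Compared with retain_grad(), a hook leaves y.grad unset, which may be preferable when the gradient is only needed for logging or debugging.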

See also:

I think this was an interesting question to post, thanks; I did not know about the retain_grad() method.
