Python PyTorch RuntimeError: Trying to backward through the graph a second time, but the saved intermediate results have already been freed

python deep-learning pytorch

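I am a university student who has just started with deep learning, and I am trying to build my first backpropagation model.

However, I keep getting the runtime error "Trying to backward through the graph a second time, but the saved intermediate results have already been freed."

I have seen many other people ask the same question here, but their example code is too advanced for me and I do not understand the answers. I have already tried adding convtraining.zero_grad() and loss.sum().backward(retain_graph=True). Neither seems to work.

My own code is as follows: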

# Import torch.
import torch
import torch.nn as nn

# Define a sample image.
image = torch.tensor([[1, 1, 0, 0, 0],
                     [0, 1, 1, 0, 0],
                     [0, 0, 1, 1, 0],
                     [0, 0, 0, 1, 1],
                     [1, 0, 0, 0, 1]], dtype = torch.float).reshape(1,1,5,5)

# Define a sample kernel.
kernel = torch.tensor([[0,-1, 0],
                       [-1, 1, -1],
                       [0,-1, 0]], dtype = torch.float).reshape(1,1,3,3)

# Define a convolution layer with the TRUE kernel weights assigned.
convolution = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3, 3), bias=False)
convolution.weight = nn.Parameter(kernel)
# Define another convolution without kernel weights to PREDICT them.
convtraining = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3, 3), bias=False)

# Run the first layer.
output_true = convolution(image)
# Reshape its output to make it suitable for comparison.
output_true = output_true.reshape((1, 1, 3, 3))

# Run the second model 5 times.
for i in range(5):
    output_prediction = convtraining(image)
    # Calculate the loss by squaring the error.
    loss = (output_prediction - output_true) ** 2
    convtraining.zero_grad()
    # Backward propagation.
    loss.sum().backward()
    # Adjust the kernel weights.
    convtraining.weight.data[:] -= 3e-2 * convtraining.weight.grad
    print(loss)

Strangely, it worked before, and I do not know what changed. Does anyone know what might be wrong?

You said that using retain_graph=True did not work, but when I ran your exact code with only retain_graph=True added, it worked fine. Here is the body of the for loop as I ran it:

for i in range(5):
    output_prediction = convtraining(image)
    # Calculate the loss by squaring the error.
    loss = (output_prediction - output_true) ** 2
    convtraining.zero_grad()
    # Backward propagation.
    loss.sum().backward(retain_graph=True)

    # Adjust the kernel weights.
    convtraining.weight.data[:] -= 3e-2 * convtraining.weight.grad
    print(loss)

The weights of convtraining become:

Parameter containing:
tensor([[[[-0.2317, -0.6439, -0.5469],
          [-0.6850,  0.1293, -0.4237],
          [-0.1236, -0.3529,  0.0662]]]], requires_grad=True)

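I also stepped through the code line by line in debug mode. It raises the reported error in the second iteration of the loop, but the error disappears once retain_graph=True is added. If the code you posted is the whole problem, then this appears to solve it.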
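One possible explanation for why the error shows up only in the second iteration: output_true is produced by a layer whose weight is an nn.Parameter, so the target itself is attached to an autograd graph, and every loss in the loop reaches back into that same graph. The sketch below is only a diagnostic under that assumption. It uses a simplified all-ones image rather than the image from the question, and the name target is introduced here for illustration.

```python
import torch
import torch.nn as nn

# Simplified stand-in for the question's setup (all-ones image is an assumption).
image = torch.ones(1, 1, 5, 5)
convolution = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3, 3), bias=False)

output_true = convolution(image)

# The target came from a layer with trainable weights, so it carries
# a graph of its own that backward() will traverse (and then free).
print(output_true.requires_grad)
print(output_true.grad_fn is not None)

# A detached copy holds the same values but has no graph behind it.
target = output_true.detach()
print(target.requires_grad)
print(target.grad_fn)
```

If both of the first two prints show True, the first backward() frees the buffers behind output_true, and the second iteration then tries to walk through them again.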

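When you call backward(), a chain of computations (the derivatives) is carried out. The intermediate results that these computations rely on are not kept in memory afterwards (they are freed). Only the final result, the gradients, is kept for you to use.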
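This behavior can be reproduced on a tiny example. The sketch below is not taken from the question: the tensor w and the flag second_call_failed are made-up names for illustration. It shows that the gradient survives the first backward() while a second call on the same graph is rejected.

```python
import torch

# Hypothetical one-element "weight" for illustration.
w = torch.tensor([3.0], requires_grad=True)
loss = (w ** 2).sum()

# First backward pass: the gradient 2 * w is stored in w.grad,
# and the intermediate values saved in the graph are freed.
loss.backward()
print(w.grad)

# Second backward pass through the same graph: the saved values are gone.
second_call_failed = False
try:
    loss.backward()
except RuntimeError as err:
    second_call_failed = True
    print("RuntimeError:", err)

print(second_call_failed)
```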

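Unfortunately, running another backward pass through the same graph needs those intermediate results, which the previous backward pass has already freed. To make sure they are kept when you want to run another backward pass, you can use retain_graph=True, like this: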

loss.sum().backward(retain_graph=True)

instead of

loss.sum().backward()

Suppose you want to do 3 backward passes. You can do it like this:

loss.sum().backward(retain_graph=True)
loss.sum().backward(retain_graph=True)
loss.sum().backward()
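As a self-contained illustration of the three-pass pattern above, here is a minimal sketch on a made-up one-element tensor (w, first, second and third are names chosen for this example, not from the question). It also shows a side effect worth knowing: gradients accumulate across backward passes unless they are zeroed in between.

```python
import torch

# Hypothetical one-element "weight" for illustration.
w = torch.tensor([2.0], requires_grad=True)
loss = (w ** 2).sum()  # d(loss)/dw = 2 * w = 4

loss.backward(retain_graph=True)   # graph kept for the next pass
first = w.grad.clone()

loss.backward(retain_graph=True)   # graph kept again, gradient accumulates
second = w.grad.clone()

loss.backward()                    # last pass, the graph may now be freed
third = w.grad.clone()

print(first, second, third)  # tensor([4.]) tensor([8.]) tensor([12.])
```

In a training loop this accumulation is usually not wanted, which is why the question's code calls convtraining.zero_grad() before each backward().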

After adding retain_graph=True inside the loop, and adding one extra step at the end without retain_graph=True, the code works.
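An alternative that is not part of the answers above, offered only as a sketch: if the fixed target output_true never needs gradients, it can be computed under torch.no_grad() so that no graph is recorded for it. Each iteration then builds and frees only its own graph, and a plain backward() should be enough. The list losses and the no_grad weight update are choices made for this example, and the loss values are not claimed to decrease.

```python
import torch
import torch.nn as nn

image = torch.tensor([[1, 1, 0, 0, 0],
                      [0, 1, 1, 0, 0],
                      [0, 0, 1, 1, 0],
                      [0, 0, 0, 1, 1],
                      [1, 0, 0, 0, 1]], dtype=torch.float).reshape(1, 1, 5, 5)
kernel = torch.tensor([[0, -1, 0],
                       [-1, 1, -1],
                       [0, -1, 0]], dtype=torch.float).reshape(1, 1, 3, 3)

# Layer holding the TRUE kernel, and a second layer that tries to learn it.
convolution = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3, 3), bias=False)
convolution.weight = nn.Parameter(kernel)
convtraining = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=(3, 3), bias=False)

# Compute the target without recording an autograd graph for it.
with torch.no_grad():
    output_true = convolution(image)

losses = []  # hypothetical helper list, only used to inspect the run
for i in range(5):
    output_prediction = convtraining(image)
    loss = ((output_prediction - output_true) ** 2).sum()
    convtraining.zero_grad()
    loss.backward()  # plain backward(), no retain_graph
    # Update the weights outside of autograd tracking.
    with torch.no_grad():
        convtraining.weight -= 3e-2 * convtraining.weight.grad
    losses.append(loss.item())

print(losses)
```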