
Gradient computation broken by the Sigmoid function in PyTorch

Tags: python, deep-learning, pytorch, cnn

Hey, I have been struggling with this weird issue. Here is my neural network code:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Three 3D conv layers; kernel 9 with stride 1 and padding 4
        # preserves the spatial dimensions.
        self.conv_3d_ = nn.Sequential(
            nn.Conv3d(1, 1, 9, 1, 4),
            nn.LeakyReLU(),
            nn.Conv3d(1, 1, 9, 1, 4),
            nn.LeakyReLU(),
            nn.Conv3d(1, 1, 9, 1, 4),
            nn.LeakyReLU()
        )

        # batch_size is a global defined elsewhere.
        self.linear_layers_ = nn.Sequential(
            nn.Linear(batch_size*32*32*32, batch_size*32*32*3),
            nn.LeakyReLU(),
            nn.Linear(batch_size*32*32*3, batch_size*32*32*3),
            nn.Sigmoid()
        )

    def forward(self, x, y, z):
        conv_layer = x + y + z
        conv_layer = self.conv_3d_(conv_layer)
        conv_layer = torch.flatten(conv_layer)  # flattens across the batch dimension too
        conv_layer = self.linear_layers_(conv_layer)
        conv_layer = conv_layer.view((batch_size, 3, input_sizes, input_sizes))
        return conv_layer
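(For context: Net reads batch_size and input_sizes from the enclosing scope. A minimal driver consistent with the shape arithmetic might look like the sketch below; batch_size = 1 and input_sizes = 32 are assumed values, not taken from the original post.)

import torch

# Hypothetical setup: batch_size = 1 keeps the very large first Linear
# layer manageable; Conv3d with kernel 9, stride 1, padding 4 preserves
# a 32x32x32 input, so the flatten/view arithmetic works out.
batch_size = 1
input_sizes = 32

net = Net()
x = torch.randn(batch_size, 1, 32, 32, 32)
y = torch.randn(batch_size, 1, 32, 32, 32)
z = torch.randn(batch_size, 1, 32, 32, 32)

out = net(x, y, z)
print(out.shape)  # torch.Size([1, 3, 32, 32])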
The weird problem I am facing is that running this NN gives me the following error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [3072]], which is output 0 of SigmoidBackward, is at version 1; expected version 0 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
The stack trace shows that the problem lies in the line

conv_layer = self.linear_layers_(conv_layer)
However, if I replace the last activation function of the FCN from nn.Sigmoid() with nn.LeakyReLU(), the NN executes correctly.


Can anyone tell me why the Sigmoid activation function is breaking my backward computation?

I found the problem with my code. I dug deeper into what the error really means. So, if you look at the line

conv_layer = self.linear_layers_(conv_layer)
the assignment from the linear layers changes the values of conv_layer, so the values that autograd saved for the backward pass are overwritten, and the gradient computation therefore fails. The simple fix for this problem is to use the clone() function,

i.e.

conv_layer = self.linear_layers_(conv_layer).clone()
This creates a copy of the result of the right-hand-side computation, so Autograd is able to keep its reference into the computation graph intact.
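The failure can be reproduced in isolation. Sigmoid's backward pass reuses the sigmoid output (the gradient is y * (1 - y)), so autograd saves that output tensor and checks its version counter at backward time; LeakyReLU's backward only needs the saved input, which is why swapping the activation made the error disappear. A minimal standalone sketch (not the original model):

import torch
import torch.nn.functional as F

x = torch.randn(5, requires_grad=True)

# Sigmoid saves its output for backward; the in-place add bumps the
# output's version counter from 0 to 1, so backward aborts.
y = torch.sigmoid(x)
y += 1
try:
    y.sum().backward()
except RuntimeError as e:
    print(e)  # "... is at version 1; expected version 0 instead"

# clone() hands the in-place op a copy, leaving the saved output intact.
y = torch.sigmoid(x).clone()
y += 1
y.sum().backward()  # works

# LeakyReLU saves its input instead, so the same in-place edit of the
# output goes unnoticed.
x2 = torch.randn(5, requires_grad=True)
y2 = F.leaky_relu(x2)
y2 += 1
y2.sum().backward()  # also works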

Does the error also occur if you run it on the CPU? Also, can you reproduce the problem with a smaller sample (and scaled-down kernels)?

No, I haven't tried it on the CPU. I will try to put together a minimal reproducible example.

@Dennilli, I think the problem is not inside the forward() function; it is most likely related to how x, y, z are declared and reused between training steps. The error message is basically saying that one of the variables was unexpectedly modified. So I think this works on the first forward call but fails on the next one.
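On the debugging question raised in these comments: PyTorch's anomaly detection can help locate which forward operation produced the tensor that was later modified in place. A small standalone sketch, again not the original training loop:

import torch

# detect_anomaly() makes autograd record a traceback for each forward
# op; when backward fails, it additionally prints the traceback of the
# forward call whose gradient could not be computed.
with torch.autograd.detect_anomaly():
    x = torch.randn(4, requires_grad=True)
    y = torch.sigmoid(x)
    y.mul_(2)  # the in-place culprit
    try:
        y.sum().backward()
    except RuntimeError as e:
        print(e)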