Machine learning: stopping backpropagation?

machine-learning deep-learning neural-network pytorch

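I need to create a neural network in which I use binary gates to zero out certain tensors, which are the outputs of disabled circuits.

To improve runtime speed, I was hoping to use torch.bool binary gates to stop backpropagation along the disabled circuits of the network. However, I ran a small experiment based on the official PyTorch example for the CIFAR-10 dataset, and the runtime is exactly the same for any values of gate_A and gate_B (which means the idea is not working).

How can I define gate_A and gate_B so that backpropagation effectively stops when they are zero?

PS. Changing the connections dynamically at runtime would also change which weights are assigned to each module. (For example, the weights associated with a could be assigned to the other branch in another pass, disrupting how the network operates.)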
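To see why multiplying by a zero gate does not save time, it can help to inspect the gradients directly. The following is a minimal sketch rather than the code from the experiment: the single conv layer and the random input are stand-ins assumed purely for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv = nn.Conv2d(3, 6, 5)                  # hypothetical stand-in for one gated branch
x = torch.randn(2, 3, 32, 32)              # random batch with CIFAR-10 image shape
gate = torch.tensor(0, dtype=torch.bool)   # closed gate

out = conv(x) * gate                       # output is zeroed, but conv is still in the graph
out.sum().backward()

# A gradient tensor was allocated and filled with zeros,
# so the backward pass still ran through conv.
grad_is_allocated = conv.weight.grad is not None
grad_is_zero = bool((conv.weight.grad == 0).all())
print(grad_is_allocated, grad_is_zero)
```

The gate only scales the values that flow backward; it does not remove the branch from the autograd graph, which is consistent with the unchanged runtime reported above.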

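You can use torch.no_grad (the code below could probably be more concise):

On a second look, I think the following is a simpler solution to the specific problem: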

# Assumes: from random import randint; import torch; import torch.nn.functional as F
def forward(self, x):
    # Only one gate is supposed to be enabled at random
    choice = randint(0, 1)

    if choice:
        a = self.pool(F.relu(self.conv1a(x)))
        a = self.pool(F.relu(self.conv2a(a)))
        # Disabled branch: constant zeros, nothing to backpropagate through.
        # Both branches produce the same shape, so zeros_like(a) matches the conv output.
        b = torch.zeros_like(a)
    else:
        b = self.pool(F.relu(self.conv1b(x)))
        b = self.pool(F.relu(self.conv2b(b)))
        a = torch.zeros_like(b)

    x = torch.cat([a, b], dim=1)

    x = torch.flatten(x, 1)  # flatten all dimensions except batch
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
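One way to check that this approach really keeps the disabled branch out of the backward pass is to look at which parameters receive a gradient. This is only a sketch under assumptions: branch_a, branch_b and head are hypothetical single-layer stand-ins, not the layers of the network above.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
branch_a = nn.Conv2d(3, 6, 5)        # hypothetical enabled branch
branch_b = nn.Conv2d(3, 6, 5)        # hypothetical disabled branch, never called
head = nn.Linear(12 * 28 * 28, 10)   # 6 + 6 channels, 28x28 after a 5x5 conv on 32x32
x = torch.randn(2, 3, 32, 32)

a = branch_a(x)
b = torch.zeros_like(a)              # constant placeholder, not attached to branch_b
out = head(torch.flatten(torch.cat([a, b], dim=1), 1))
out.sum().backward()

a_has_grad = branch_a.weight.grad is not None   # True
b_has_grad = branch_b.weight.grad is not None   # False
print(a_has_grad, b_has_grad)
```

Because branch_b is never called, neither its forward nor its backward computation is performed for this step.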

The simple solution is to just define a tensor of zeros whenever a or b is disabled :)

PS: I came up with this while having a coffee.

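In your example, is what is enabled and what is disabled always fixed? If not, which part of the code decides that?

No, they are actually supposed to change at random :)

So you have two gates and, at random, only one of them is enabled?

Yes, that is correct. This technique is essential for neural architecture search. If backpropagation along the disabled gates is not stopped somehow, the runtime can increase several-fold.

Thanks, this seems to work :)

@C-3PO On a second look, I'm not sure this is necessary in your particular case. Why can't you just skip the forward-pass computation for the disabled gate? Depending on the choice, you can run the forward pass through one half, concatenate zeros for the other half, and continue as before. (I guess you can, but I'm asking because of the PS in your question.)

Yes, I had the same idea. In fact, I came up with the code in the post below while having a coffee. Thanks for the feedback.

@C-3PO Nice, the results are the same :)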

# Variant that wraps the disabled branch in torch.no_grad()
# Assumes: from random import randint; import torch; import torch.nn.functional as F
def forward(self, x):
    # Only one gate is supposed to be enabled at random
    # However, for the experiment, I fixed the values to [1,0] and [1,1]
    choice = randint(0, 1)
    gate_A = torch.tensor(choice, dtype=torch.bool)
    gate_B = torch.tensor(1 - choice, dtype=torch.bool)

    if choice:
        a = self.pool(F.relu(self.conv1a(x)))
        a = self.pool(F.relu(self.conv2a(a)))
        a *= gate_A

        with torch.no_grad():  # disable gradient computation
            b = self.pool(F.relu(self.conv1b(x)))
            b = self.pool(F.relu(self.conv2b(b)))
            b *= gate_B
    else:
        with torch.no_grad():  # disable gradient computation
            a = self.pool(F.relu(self.conv1a(x)))
            a = self.pool(F.relu(self.conv2a(a)))
            a *= gate_A

        b = self.pool(F.relu(self.conv1b(x)))
        b = self.pool(F.relu(self.conv2b(b)))
        b *= gate_B

    x = torch.cat([a, b], dim=1)

    x = torch.flatten(x, 1)  # flatten all dimensions except batch
    x = F.relu(self.fc1(x))
    x = F.relu(self.fc2(x))
    x = self.fc3(x)
    return x
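The effect of torch.no_grad on a single branch can be checked in isolation. The sketch below uses two hypothetical one-layer branches (branch_a and branch_b are assumptions for illustration, not the layers of the network above).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
branch_a = nn.Conv2d(3, 6, 5)   # hypothetical enabled branch
branch_b = nn.Conv2d(3, 6, 5)   # hypothetical disabled branch
x = torch.randn(2, 3, 32, 32)

a = branch_a(x)
with torch.no_grad():
    b = branch_b(x)             # forward still runs, but is not recorded for autograd

torch.cat([a, b], dim=1).sum().backward()

b_tracked = b.requires_grad                      # False
a_has_grad = branch_a.weight.grad is not None    # True
b_has_grad = branch_b.weight.grad is not None    # False
print(b_tracked, a_has_grad, b_has_grad)
```

Note that the forward computation of the disabled branch is still paid for here; only its backward pass is avoided.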

from random import randint

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.pool = nn.MaxPool2d(2, 2)
        # Two parallel convolutional branches, a and b
        self.conv1a = nn.Conv2d(3, 6, 5)
        self.conv2a = nn.Conv2d(6, 16, 5)
        self.conv1b = nn.Conv2d(3, 6, 5)
        self.conv2b = nn.Conv2d(6, 16, 5)
        # 16 + 16 channels after concatenation, 5x5 feature maps for 32x32 inputs
        self.fc1 = nn.Linear(32 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Run only one branch, chosen at random; the other is replaced by zeros
        if randint(0, 1):
            a = self.pool(F.relu(self.conv1a(x)))
            a = self.pool(F.relu(self.conv2a(a)))
            b = torch.zeros_like(a)
        else:
            b = self.pool(F.relu(self.conv1b(x)))
            b = self.pool(F.relu(self.conv2b(b)))
            a = torch.zeros_like(b)

        x = torch.cat([a, b], dim=1)

        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
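To compare the runtime of the two strategies, a rough timing harness along the following lines could be used. It is a sketch under assumptions: time_step, the single-layer branches, the batch size and the repeat count are all hypothetical choices, and the numbers it prints will vary from machine to machine.

```python
import time

import torch
import torch.nn as nn


def time_step(skip_disabled, repeats=5):
    """Average wall-clock seconds per forward + backward step (rough, CPU only)."""
    torch.manual_seed(0)
    branch_a = nn.Conv2d(3, 16, 5)             # hypothetical enabled branch
    branch_b = nn.Conv2d(3, 16, 5)             # hypothetical disabled branch
    x = torch.randn(8, 3, 32, 32)
    gate_b = torch.tensor(0, dtype=torch.bool)  # closed gate for branch_b

    start = time.perf_counter()
    for _ in range(repeats):
        a = branch_a(x)
        if skip_disabled:
            b = torch.zeros_like(a)            # branch_b is never evaluated
        else:
            b = branch_b(x) * gate_b           # branch_b is evaluated, then zeroed
        torch.cat([a, b], dim=1).sum().backward()
    return (time.perf_counter() - start) / repeats


t_gated = time_step(skip_disabled=False)
t_skipped = time_step(skip_disabled=True)
print(f"gated: {t_gated:.4f}s  skipped: {t_skipped:.4f}s")
```

On a GPU, torch.cuda.synchronize() would need to be called before each time.perf_counter() reading, because CUDA kernels are launched asynchronously.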