Machine learning: how to stop backpropagation?
Tags: machine-learning, deep-learning, neural-network, pytorch, backpropagation

I need to create a neural network where I use binary gates to zero out certain tensors, which are the outputs of disabled circuits.

To improve runtime speed, I was hoping to use torch.bool binary gates to stop backpropagation along the disabled circuits in the network. However, I set up a small experiment based on the official PyTorch example for the CIFAR-10 dataset, and the runtime is exactly the same for any values of gate_A and gate_B (which means the idea is not working).

How can I define gate_A and gate_B so that backpropagation effectively stops when they are zero?

PS. Changing the concatenation dynamically at run time would also change which weights are assigned to each module (for example, the weights associated with a could be assigned to b in another pass, disrupting how the network operates).

You can use torch.no_grad() (the code for this, given further down, could probably be more concise).

On second look, I think the following is a simple solution to the specific problem:
    def forward(self, x):
        # Only one gate is supposed to be enabled at random
        # However, for the experiment, I fixed the values to [1,0] and [1,1]
        choice = randint(0, 1)
        if choice:
            a = self.pool(F.relu(self.conv1a(x)))
            a = self.pool(F.relu(self.conv2a(a)))
            b = torch.zeros(shape_of_conv_output)  # replace shape of conv output here
        else:
            b = self.pool(F.relu(self.conv1b(x)))
            b = self.pool(F.relu(self.conv2b(b)))
            a = torch.zeros(shape_of_conv_output)  # replace shape of conv output here
        x = torch.cat([a, b], dim=1)
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
The simple solution is to define a tensor of zeros whenever a or b is disabled :)
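As a rough illustration of why the gated experiment in the question showed no speed-up, the sketch below contrasts multiplying a branch by a zero gate with skipping the branch and substituting zeros. The layer names (conv_a, conv_b, conv_c) and the input shape are assumptions made for this example only; they are not the names used in the thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv_a = nn.Conv2d(3, 6, 5)  # hypothetical enabled branch
conv_b = nn.Conv2d(3, 6, 5)  # hypothetical branch gated by multiplication
conv_c = nn.Conv2d(3, 6, 5)  # hypothetical branch that is skipped entirely
x = torch.randn(2, 3, 32, 32)

# Case 1: multiplicative gate. conv_b stays in the autograd graph, so
# backward still traverses it and produces a gradient made of zeros.
gate_b = torch.tensor(False)
a = conv_a(x)
b = conv_b(x) * gate_b
torch.cat([a, b], dim=1).sum().backward()
gated_grad = conv_b.weight.grad

# Case 2: skipped branch. conv_c never runs, so there is nothing for
# autograd to traverse and its gradient is never allocated.
a = conv_a(x)
c = torch.zeros_like(a)
torch.cat([a, c], dim=1).sum().backward()
skipped_grad = conv_c.weight.grad

print("gated grad is all zeros:", bool(torch.count_nonzero(gated_grad) == 0))
print("skipped grad is None:", skipped_grad is None)
```

In the first case the work is done and then thrown away; in the second it is never done, which is where the saving would be expected to come from.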
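Also, I came up with this while having coffee.

Comments:

In your example, does a always mean enabled and b always mean disabled? If not, which part of the code decides this?

No, they are actually supposed to change at random :)

So you have two gates, and at random only one of them is enabled?

Yes, that is correct. This technique is essential for neural architecture search. If backpropagation along the disabled gates is not stopped somehow, the running time could multiply.

Thanks, this seems to work :)

@C-3PO On second look, I am not sure this is necessary in your particular case. Why can't you skip the forward-pass computation for the disabled gate altogether? Conditioned on choice, you could run the forward pass for one half, concatenate it with zeros for the other half, and continue as before. (I think you can, but I am asking because of the PS in your question.)

Yes, I had the same idea. In fact, I came up with the code in the post below while having coffee. Thanks for the feedback.

@C-3PO Nice, the results are the same :)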
The torch.no_grad() version:

    def forward(self, x):
        # Only one gate is supposed to be enabled at random
        # However, for the experiment, I fixed the values to [1,0] and [1,1]
        choice = randint(0, 1)
        gate_A = torch.tensor(choice, dtype=torch.bool)
        gate_B = torch.tensor(1 - choice, dtype=torch.bool)
        if choice:
            a = self.pool(F.relu(self.conv1a(x)))
            a = self.pool(F.relu(self.conv2a(a)))
            a *= gate_A
            with torch.no_grad():  # disable gradient computation
                b = self.pool(F.relu(self.conv1b(x)))
                b = self.pool(F.relu(self.conv2b(b)))
                b *= gate_B
        else:
            with torch.no_grad():  # disable gradient computation
                a = self.pool(F.relu(self.conv1a(x)))
                a = self.pool(F.relu(self.conv2a(a)))
                a *= gate_A
            b = self.pool(F.relu(self.conv1b(x)))
            b = self.pool(F.relu(self.conv2b(b)))
            b *= gate_B
        x = torch.cat([a, b], dim=1)
        x = torch.flatten(x, 1)  # flatten all dimensions except batch
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x
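To make the effect of the torch.no_grad() block concrete, here is a minimal check, assuming two stand-alone layers (conv_on and conv_off are hypothetical names, not taken from the thread). It suggests that the disabled branch is cut out of the autograd graph, although its forward pass is still executed.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv_on = nn.Conv2d(3, 6, 5)   # hypothetical enabled branch
conv_off = nn.Conv2d(3, 6, 5)  # hypothetical disabled branch
x = torch.randn(2, 3, 32, 32)

a = conv_on(x)
with torch.no_grad():
    # The convolution is still computed here, but no graph is recorded.
    b = conv_off(x)
    b *= torch.tensor(False)

out = torch.cat([a, b], dim=1)
out.sum().backward()

print("b.requires_grad:", b.requires_grad)
print("b.grad_fn:", b.grad_fn)
print("conv_off grad is None:", conv_off.weight.grad is None)
print("conv_on grad is None:", conv_on.weight.grad is None)
```

Because the disabled branch's forward pass still costs time in this variant, replacing the branch output with zeros saves more work when only one gate is active.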
Full model using the zero-tensor approach:

    from random import randint

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class Net(nn.Module):
        def __init__(self):
            super().__init__()
            self.pool = nn.MaxPool2d(2, 2)
            self.conv1a = nn.Conv2d(3, 6, 5)
            self.conv2a = nn.Conv2d(6, 16, 5)
            self.conv1b = nn.Conv2d(3, 6, 5)
            self.conv2b = nn.Conv2d(6, 16, 5)
            self.fc1 = nn.Linear(32 * 5 * 5, 120)
            self.fc2 = nn.Linear(120, 84)
            self.fc3 = nn.Linear(84, 10)

        def forward(self, x):
            # Run only one branch, chosen at random; the other is replaced by zeros
            if randint(0, 1):
                a = self.pool(F.relu(self.conv1a(x)))
                a = self.pool(F.relu(self.conv2a(a)))
                b = torch.zeros_like(a)
            else:
                b = self.pool(F.relu(self.conv1b(x)))
                b = self.pool(F.relu(self.conv2b(b)))
                a = torch.zeros_like(b)
            x = torch.cat([a, b], dim=1)
            x = torch.flatten(x, 1)  # flatten all dimensions except batch
            x = F.relu(self.fc1(x))
            x = F.relu(self.fc2(x))
            x = self.fc3(x)
            return x
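A quick way to sanity-check this approach is to run one optimizer step and confirm that only the active branch's weights move. The sketch below uses a compact stand-in model rather than the Net above: TwoBranchNet, its layer sizes, and the use_a argument (which replaces randint so the run is deterministic) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TwoBranchNet(nn.Module):
    """Hypothetical, smaller stand-in for Net with an explicit branch switch."""

    def __init__(self):
        super().__init__()
        self.pool = nn.MaxPool2d(2, 2)
        self.conv_a = nn.Conv2d(3, 16, 5)
        self.conv_b = nn.Conv2d(3, 16, 5)
        # 32x32 input -> 28x28 after conv -> 14x14 after pooling, 16 + 16 channels
        self.fc = nn.Linear(32 * 14 * 14, 10)

    def forward(self, x, use_a):
        if use_a:
            a = self.pool(F.relu(self.conv_a(x)))
            b = torch.zeros_like(a)  # disabled branch: no computation, no graph
        else:
            b = self.pool(F.relu(self.conv_b(x)))
            a = torch.zeros_like(b)
        x = torch.cat([a, b], dim=1)
        return self.fc(torch.flatten(x, 1))


torch.manual_seed(0)
net = TwoBranchNet()
opt = torch.optim.SGD(net.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-2)
x = torch.randn(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))

before_a = net.conv_a.weight.detach().clone()
before_b = net.conv_b.weight.detach().clone()

opt.zero_grad()
loss = F.cross_entropy(net(x, use_a=True), y)
loss.backward()
opt.step()

a_changed = not torch.equal(before_a, net.conv_a.weight)
b_changed = not torch.equal(before_b, net.conv_b.weight)
print("enabled branch updated:", a_changed)
print("disabled branch updated:", b_changed)
```

Since the skipped branch's gradients stay None, the optimizer passes over those parameters entirely, so even weight decay leaves the disabled branch untouched in this sketch.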