PyTorch partition module: how to propagate the gradient?

I made the following PyTorch module:

from torch import nn

class Partition(nn.Module):
    def __init__(self, decider, left_child, right_child):
        super().__init__()
        self.left_child = left_child
        self.right_child = right_child
        self.decider = decider

    def forward(self, x):
        left_mask = self.decider(x)
        right_mask = ~left_mask
        o = x.new_empty(x.size())
        o[left_mask] = self.left_child(x[left_mask])
        o[right_mask] = self.right_child(x[right_mask])
        return o

It does what I want it to do, but it does not propagate the gradient. How can I implement it in such a way that it does?

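I know I could do a full evaluation of left_child and right_child on the whole input and then combine the results as m * l(x) + (1 - m) * r(x), where m = float(decider(x)), but for performance reasons I would prefer not to.

I am not sure what kind of "performance" you are looking for, but the copies created by masking are much more expensive than operations on whole tensors. The solution m * l(x) + (1 - m) * r(x) should perform well and should propagate the gradient as desired...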
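
The blended form m * l(x) + (1 - m) * r(x) could be written roughly as below. This is only a minimal sketch: the class name BlendedPartition, the nn.Linear children, and the t > 0 decider are hypothetical choices made for illustration, and the decider is assumed to return a boolean mask with the same shape as the children's outputs.

```python
import torch
from torch import nn


class BlendedPartition(nn.Module):
    """Evaluate both children on the full input and blend them with a 0/1 mask."""

    def __init__(self, decider, left_child, right_child):
        super().__init__()
        self.decider = decider
        self.left_child = left_child
        self.right_child = right_child

    def forward(self, x):
        # Assumption: decider(x) returns a boolean tensor; cast it to x's dtype
        # so it can be used as a multiplicative 0/1 mask.
        m = self.decider(x).to(x.dtype)
        # Both children run on all of x; the mask selects which result is kept.
        return m * self.left_child(x) + (1 - m) * self.right_child(x)


# Hypothetical children and decider, chosen only for illustration.
left = nn.Linear(3, 3)
right = nn.Linear(3, 3)
module = BlendedPartition(lambda t: t > 0, left, right)

x = torch.randn(4, 3, requires_grad=True)
module(x).sum().backward()

# After backward(), gradients are populated on the input and on both children.
print(x.grad.shape, left.weight.grad.shape, right.weight.grad.shape)
```

Note that the mask itself comes from a hard comparison, so no gradient flows back into the decider; only the children and the input receive gradients. Also, because both branches are evaluated everywhere, a NaN or infinity produced by the unselected branch is not hidden by multiplying with zero.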