PyTorch: torch.mul makes param.grad NoneType


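I have a simple model class with four inputs, each of which has its own linear layer. I want the output to be the product of the four nodes, but for some reason, no matter how I multiply them (with torch.mul or *), the gradient is always NoneType: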

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(torch.nn.Module):
    def __init__(self, D_u, D_i, D_t, D_m):
        super(Net, self).__init__()
        self.lin_u = nn.Linear(D_u, 1)
        self.lin_i = nn.Linear(D_i, 1)
        self.lin_t = nn.Linear(D_t, 1)
        self.lin_m = nn.Linear(D_m, 1)
      
        self.output = nn.Linear(4, 1)

    def forward(self, args):
        (u, i, t, m) = args
        u = F.relu(self.lin_u(u))
        i = F.relu(self.lin_i(i))
        t = F.relu(self.lin_t(t))
        m = F.relu(self.lin_m(m))
        out = torch.mul(u, i)
        out = torch.mul(out, t)
        out = torch.mul(out, m)
        return out

I have set the inputs to requires_grad=True. I think the problem is that out is not a leaf and therefore has no gradient, but I don't know how to fix that.

Edit:

The data u_block, i_block, t_block, m_block and y_block look like the following. u_block, i_block and t_block are one-hot vectors.

u_block:  tensor([[1., 0., 0.,  ..., 0., 0., 0.],
        [1., 0., 0.,  ..., 0., 0., 0.],
        [1., 0., 0.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 0., 0., 1.],
        [0., 0., 0.,  ..., 0., 0., 1.],
        [0., 0., 0.,  ..., 0., 0., 1.]], requires_grad=True)
i_block:  tensor([[1., 0., 0.],
        [1., 0., 0.],
        [1., 0., 0.],
        ...,
        [0., 1., 0.],
        [0., 1., 0.],
        [0., 1., 0.]], requires_grad=True)
t_block:  tensor([[1., 0., 0.,  ..., 0., 0., 0.],
        [0., 1., 0.,  ..., 0., 0., 0.],
        [0., 0., 1.,  ..., 0., 0., 0.],
        ...,
        [0., 0., 0.,  ..., 1., 0., 0.],
        [0., 0., 0.,  ..., 0., 1., 0.],
        [0., 0., 0.,  ..., 0., 0., 1.]], requires_grad=True)
m_block:  tensor([[ 0.0335],
        [ 0.0000],
        [ 0.0000],
        ...,
        [ 0.1515],
        [-0.2261],
        [-0.0402]], requires_grad=True)
y_block:  tensor([[ 0.0000],
        [ 0.0000],
        [ 0.0000],
        ...,
        [-0.2261],
        [-0.0402],
        [-0.1318]], requires_grad=True)

The parameter update then fails:

TypeError                                 
--->   param -= learning_rate * param.grad

TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'

Make the following change: you are not using self.output, so I have commented it out. An unused layer leaves its gradient as None. Its parameters have requires_grad=True by default, but they never take part in the forward pass, so backward() never reaches them and param.grad is never filled in.
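A minimal sketch of that change, reconstructed from the description rather than copied from the answer. The layer sizes and the random batch are made up for illustration. With the unused layer commented out, every remaining parameter should receive a gradient after backward():

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self, D_u, D_i, D_t, D_m):
        super().__init__()
        self.lin_u = nn.Linear(D_u, 1)
        self.lin_i = nn.Linear(D_i, 1)
        self.lin_t = nn.Linear(D_t, 1)
        self.lin_m = nn.Linear(D_m, 1)
        # self.output = nn.Linear(4, 1)  # never used in forward(), so its grad stayed None

    def forward(self, args):
        u, i, t, m = args
        u = F.relu(self.lin_u(u))
        i = F.relu(self.lin_i(i))
        t = F.relu(self.lin_t(t))
        m = F.relu(self.lin_m(m))
        return u * i * t * m  # same result as chaining torch.mul


# Hypothetical sizes and random data, only to exercise the model.
torch.manual_seed(0)
net = Net(5, 3, 4, 1)
batch = (torch.rand(8, 5), torch.rand(8, 3), torch.rand(8, 4), torch.rand(8, 1))
target = torch.rand(8, 1)

loss = F.mse_loss(net(batch), target)
loss.backward()

# Names of parameters that backward() did not reach.
missing = [name for name, p in net.named_parameters() if p.grad is None]
print("parameters without a gradient:", missing)

# For comparison: a layer that never takes part in a forward pass keeps grad=None.
unused = nn.Linear(4, 1)
print("unused layer grad is None:", unused.weight.grad is None)
```

Listing the parameters whose grad is None, as done with named_parameters() here, is also a quick way to find which layer is disconnected from the loss in the first place.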

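I hope this solves your problem.

Also, I have a few suggestions:

Rename args to something else, or, if you want to keep it, make full use of it by changing it to *args.

Do not set requires_grad on the inputs, because autograd will then also compute d_loss/d_input. Set it only if you actually intend to use those gradients.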
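Both suggestions can be sketched together as follows. This is an illustration under assumptions, not the original training code: the sizes, the random one-hot data, the learning rate and the MSE loss are all made up. The forward signature takes the four tensors separately, the inputs are plain tensors, and the manual update skips any parameter whose grad is still None:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self, D_u, D_i, D_t, D_m):
        super().__init__()
        self.lin_u = nn.Linear(D_u, 1)
        self.lin_i = nn.Linear(D_i, 1)
        self.lin_t = nn.Linear(D_t, 1)
        self.lin_m = nn.Linear(D_m, 1)

    def forward(self, u, i, t, m):  # explicit arguments instead of a tuple named args
        u = F.relu(self.lin_u(u))
        i = F.relu(self.lin_i(i))
        t = F.relu(self.lin_t(t))
        m = F.relu(self.lin_m(m))
        return u * i * t * m


torch.manual_seed(0)
net = Net(5, 3, 4, 1)

# Plain input tensors: requires_grad defaults to False, so no d_loss/d_input is computed.
u = F.one_hot(torch.randint(0, 5, (8,)), num_classes=5).float()
i = F.one_hot(torch.randint(0, 3, (8,)), num_classes=3).float()
t = F.one_hot(torch.randint(0, 4, (8,)), num_classes=4).float()
m = torch.randn(8, 1)
y = torch.randn(8, 1)

learning_rate = 1e-2  # hypothetical value
loss = F.mse_loss(net(u, i, t, m), y)
loss.backward()

# Manual SGD step; the None check guards against parameters outside the graph.
with torch.no_grad():
    for param in net.parameters():
        if param.grad is None:
            continue
        param -= learning_rate * param.grad
net.zero_grad()
```

The None check does not replace removing unused layers; it only keeps the update loop from raising the TypeError when a parameter has no gradient.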