PyTorch weights not updated after softmax

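I am using PyTorch to solve an optimization problem: find a set of weights w such that the weighted average of x, sum(w*x)/sum(w), can be used to estimate some variable, say y.

Here is my PyTorch "model":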

import numpy as np
import torch
from torch import nn, optim

dtype = torch.float
device = torch.device('cpu')

class WAvg(nn.Module):
    def __init__(self, p):
        super(WAvg, self).__init__()
        self.p = p
        self.q = nn.Parameter(torch.randn(self.p, 1, device=device, dtype=dtype))
        self.w = nn.functional.softmax(self.q, dim=0)
    def forward(self, x):
        w_avg = nn.functional.linear(x, self.w.T)
        return w_avg

Training code:

x_tr = np.array([
    [1, 1, 1],
    [1, 4, 1],
    [2, 4, 6], 
    [1, 2, 3], 
    [4, 2, -3], 
    [2, 2, 2] 
])
y_tr = np.array([1, 2.1, 3.9, 2, 1.2, 1.8])

x_tr = torch.from_numpy(x_tr).float()
y_tr = torch.from_numpy(y_tr).float()


wa = WAvg(3)

criterion = nn.MSELoss()
optimizer = optim.Adam(wa.parameters(), lr=0.01)

for epoch in range(10):
    # Set running loss
    running_loss_tr = 0.0
    # zero the parameter gradients
    optimizer.zero_grad()
    # forward + backward + optimize
    y_pred_tr = wa(x_tr)
    loss_tr = criterion(y_pred_tr, y_tr)
    loss_tr.backward()
    optimizer.step()
    # print statistics
    print(epoch, loss_tr.item())

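This produces an error:

RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time.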

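I added the argument retain_graph=True to loss_tr.backward() (as suggested in this post), but the parameters q or w do not seem to be updated. I suspect the problem is caused by softmax, which constrains the weights to sum to 1. Any clue how to fix it?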

Output:

0 1.305460810661316
1 1.305460810661316
2 1.305460810661316
3 1.305460810661316
4 1.305460810661316
5 1.305460810661316
6 1.305460810661316
7 1.305460810661316
8 1.305460810661316
9 1.305460810661316
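
The suspicion that softmax itself blocks the update can be checked in isolation. The sketch below is only an illustration: the names q, w and x and the sample row are made up here and are not part of the question's code. It checks that the softmax output sums to 1, so sum(w*x)/sum(w) reduces to sum(w*x), and that a gradient still reaches q when the softmax is part of the computation being differentiated.

```python
import torch

torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

# Illustrative stand-ins for the question's q and w.
q = torch.randn(3, 1, requires_grad=True)
w = torch.softmax(q, dim=0)          # weights constrained to sum to 1

x = torch.tensor([[2.0, 4.0, 6.0]])  # one sample row, shape (1, 3)

total = w.sum().item()               # should be 1 up to rounding
weighted_avg = (x @ w) / w.sum()     # sum(w*x) / sum(w)
plain = x @ w                        # sum(w*x)

# Backpropagate through the softmax down to q.
weighted_avg.sum().backward()
grad_norm = q.grad.abs().sum().item()
```

If grad_norm is nonzero, the sum-to-1 constraint is not what prevents learning; the gradient flows through softmax as usual.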

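ma.parameters()... did you mean wa?

This is the problem:

optimizer = optim.Adam(ma.parameters(), lr=0.01)

This explains why you were not updating the right parameters, and since you were not zeroing the right gradients either, it suggests you were backpropagating twice through the same graph.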
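
To make "backpropagating twice through the same graph" concrete, here is a minimal sketch, assuming the softmax is computed once outside the loop in the same way the question computes self.w once in __init__. The names q, w, x and errors are illustrative only. The first backward() releases the tensors saved by the softmax node, so the second backward() through that same node fails.

```python
import torch

q = torch.randn(3, 1, requires_grad=True)
w = torch.softmax(q, dim=0)   # the graph from q to w is built once, here

x = torch.tensor([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

errors = []
for step in range(2):
    # Each iteration builds new nodes for the matmul and the loss,
    # but reuses the single softmax node created above.
    loss = (x @ w).pow(2).sum()
    try:
        loss.backward()
    except RuntimeError as exc:
        # Raised on the second pass: the softmax node's buffers are gone.
        errors.append(str(exc))
```

Passing retain_graph=True silences the error but does not recompute w, which is consistent with the constant loss shown in the question.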

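Thanks for spotting my typo, but after changing ma to wa, the parameters are still not updated.

During initialization, you are computing self.w. It is not a leaf node, so it will not get a gradient and will not be updated. The only leaf node in your network is self.q, but changing self.q will have no effect on self.w, because self.w is not recomputed during the forward pass. One thing that should work is moving the self.w = line to the first line of forward.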
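
The suggestion above can be sketched as follows. This is one possible rewrite under stated assumptions, not the original poster's final code: the softmax is recomputed from self.q on every forward pass, squeeze(-1) is added here so the prediction matches the shape of y_tr, and the fixed seed is only for reproducibility.

```python
import torch
from torch import nn, optim


class WAvg(nn.Module):
    def __init__(self, p):
        super().__init__()
        # q is the only learnable (leaf) tensor.
        self.q = nn.Parameter(torch.randn(p, 1))

    def forward(self, x):
        # Recompute the weights from q on every call so the graph is rebuilt.
        w = nn.functional.softmax(self.q, dim=0)         # shape (p, 1)
        # squeeze(-1) is an addition: (n, 1) -> (n,) to match the targets.
        return nn.functional.linear(x, w.T).squeeze(-1)


torch.manual_seed(0)  # assumption: fixed seed, only for reproducibility

x_tr = torch.tensor([[1, 1, 1], [1, 4, 1], [2, 4, 6],
                     [1, 2, 3], [4, 2, -3], [2, 2, 2]], dtype=torch.float)
y_tr = torch.tensor([1, 2.1, 3.9, 2, 1.2, 1.8])

wa = WAvg(3)
criterion = nn.MSELoss()
optimizer = optim.Adam(wa.parameters(), lr=0.01)

losses = []
for epoch in range(10):
    optimizer.zero_grad()
    loss = criterion(wa(x_tr), y_tr)
    loss.backward()        # no retain_graph needed: the graph is fresh
    optimizer.step()
    losses.append(loss.item())
```

With this layout the recorded losses should change from epoch to epoch, and no retain_graph=True is required.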