Python: optimizing with BCE is not working, nothing changes
I have the following code:
import torch
import torch.nn as nn
import torch.nn.functional as F
from tqdm import tqdm
import matplotlib.pyplot as plt
import os
import keras
from random import choice
import sys

devicet = 'cuda' if torch.cuda.is_available() else 'cpu'
device = torch.device(devicet)
if devicet == 'cpu':
    print ('Using CPU')
else:
    print ('Using GPU')
    cuda0 = torch.device('cuda:0')

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.step1 = nn.Linear(5, 25)
        self.step2 = nn.Linear(25, 50)
        self.step3 = nn.Linear(50, 100)
        self.step4 = nn.Linear(100, 100)
        self.step5 = nn.Linear(100, 10)
        self.step6 = nn.Linear(10, 1)

    def forward(self, x):
        x = F.relu(x)
        x = self.step1(x)
        x = F.relu(x)
        x = self.step2(x)
        x = F.relu(x)
        x = self.step3(x)
        x = F.relu(x)
        x = self.step4(x)
        x = F.relu(x)
        x = self.step5(x)
        x = F.relu(x)
        x = self.step6(x)
        x = F.relu(x)
        return (x)

net = Net()
x = torch.rand(10,5)
num = choice(range(10))
zero_tensor = torch.zeros(num, 1)
one_tensor = torch.ones(10-num, 1)
y = torch.cat((zero_tensor,one_tensor),0)
x.to(devicet)
y.to(devicet)
learning_rate = 1e-3
optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
loss_fn = torch.nn.BCELoss()
acc_list = []
for i in tqdm(range(1000),desc='Training'):
    y_pred = net(x)
    loss = loss_fn(y_pred, y)
    loss.backward()
    optimizer.step()
    acc_list.append(abs(net(x).detach().numpy()[0]-y.detach().numpy()[0]))
    with torch.no_grad():
        for param in net.parameters():
            param -= learning_rate * param.grad
    optimizer.zero_grad()
print ('\nFinished training in {} epochs.'.format(len(acc_list)))
plt.plot(range(len(acc_list)),acc_list)
plt.show()
for i in range(10):
    print (str(net(x).detach().numpy()[i][0])+', '+str(y.detach().numpy()[i][0]))
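When I run it, it always prints out only the following:
Why isn't it training at all? If I use MSE loss it works (actually, it only works some of the time with MSE loss; other times it does the same thing as in the image). Only when I use BCE does it stop working completely.

Final layer activation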
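You only output non-negative values (the final ReLU has a range of [0, ∞)), whereas BCELoss expects values between 0 and 1. For starters, the problem is specifically these lines: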
x = F.relu(x)
return (x)
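Use torch.sigmoid together with BCELoss instead, or even better, just return x without any final activation and use BCEWithLogitsLoss, which works on the logits directly.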
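The two options suggested above can be compared side by side. The sketch below is only an illustration: `last_layer`, `features` and `targets` are made-up stand-ins for the end of the question's network and its data, not code from the original post. It shows that applying `torch.sigmoid` before `BCELoss` and passing raw logits to `BCEWithLogitsLoss` give the same loss value up to floating-point error.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the last layer of the network and a small batch.
last_layer = nn.Linear(10, 1)
features = torch.rand(10, 10)
targets = torch.cat((torch.zeros(4, 1), torch.ones(6, 1)), 0)

# Raw scores (logits): note there is no ReLU applied on top.
logits = last_layer(features)

# Option 1: squash the logits into (0, 1) with sigmoid, then use BCELoss.
loss_sigmoid = nn.BCELoss()(torch.sigmoid(logits), targets)

# Option 2: hand the logits to BCEWithLogitsLoss, which applies the
# sigmoid internally in a numerically more stable way.
loss_logits = nn.BCEWithLogitsLoss()(logits, targets)

print(loss_sigmoid.item(), loss_logits.item())
```

The second option is usually preferable because the combined sigmoid and log computation is less prone to overflow or underflow for large-magnitude logits.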
Training
You are using the Adam optimizer here and also performing SGD manually:
with torch.no_grad():
    for param in net.parameters():
        param -= learning_rate * param.grad
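Essentially you are applying the optimization step twice per iteration, which is probably too much and may destroy the weights. optimizer.step() already does the job; there is no need for both.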
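A training loop with a single update per iteration might look like the sketch below. It is an assumption-laden illustration rather than the asker's exact code: the network is a small hypothetical `nn.Sequential` that returns raw logits, the loss is `BCEWithLogitsLoss`, and the fixed 4/6 label split and the 200 iterations are arbitrary choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Small hypothetical model; the last layer returns raw logits (no ReLU).
net = nn.Sequential(nn.Linear(5, 25), nn.ReLU(), nn.Linear(25, 1))

# Random inputs and fixed labels, in the spirit of the question's data.
x = torch.rand(10, 5)
y = torch.cat((torch.zeros(4, 1), torch.ones(6, 1)), 0)

optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

losses = []
for _ in range(200):
    optimizer.zero_grad()        # clear gradients left from the previous step
    loss = loss_fn(net(x), y)    # one forward pass per iteration
    loss.backward()              # compute gradients
    optimizer.step()             # the only parameter update
    losses.append(loss.item())

print(losses[0], losses[-1])
```

Calling `optimizer.zero_grad()` at the top of the loop, instead of at the end, is a matter of taste; what matters is that gradients are cleared once per iteration and that the parameters are updated only through `optimizer.step()`.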
Accuracy
This part:
abs(net(x).detach().numpy()[0]-y.detach().numpy()[0])
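I assume you want to calculate accuracy. It would look something like this (also, don't push the data through the network twice via net(x); you already have y_pred!):

# Assuming sigmoid activation
def accuracy(y_pred, y_true):
    # For logits use
    # predicted_labels = y_pred > 0.0
    predicted_labels = y_pred > 0.5
    return torch.mean((y_true == predicted_labels).float())

Comments:

As far as I can tell, x and y are completely unrelated. How do you expect to predict y accurately without any information related to y?

@jodag For this small example it should be able to fit the random data anyway.

@SzymonMaszke I suppose that's true, since y never changes. It just seems a bit odd to use such heavy machinery to encode a constant.

@jodag Small random data is fine for debugging the architecture/pipeline, and it's not the worst idea when learning PyTorch either.

I implemented it and it works perfectly, thank you so much! I'm new to PyTorch and this helped me a lot!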
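To check the accuracy idea from the answer on a tiny hand-made example, here is a sketch of the logits variant. The function name `accuracy_from_logits` and the sample tensors are made up for illustration. Since sigmoid(0) is 0.5, thresholding logits at 0 is equivalent to thresholding probabilities at 0.5.

```python
import torch

def accuracy_from_logits(logits, y_true):
    # sigmoid(0) == 0.5, so a logit above 0 means a probability above 0.5.
    predicted_labels = (logits > 0.0).float()
    # Fraction of predictions that match the targets.
    return torch.mean((y_true == predicted_labels).float())

# Hand-made example: predictions are 0, 1, 1, 0 and targets are 0, 1, 0, 0.
logits = torch.tensor([[-2.0], [0.5], [3.0], [-0.1]])
y_true = torch.tensor([[0.0], [1.0], [0.0], [0.0]])

acc = accuracy_from_logits(logits, y_true)
print(acc.item())  # 0.75 (three of the four predictions match)
```

Inside a training loop the function can be called on the y_pred that was already computed for the loss, so no second forward pass is needed.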