PyTorch: parameter gradients stay at 0
In an assignment I have to implement variational inference for Bayesian logistic regression, and I am stuck at the optimization step because the gradients stay at 0. To summarize what variational inference is:
- We want to approximate the unknown posterior distribution with a Gaussian.
- We do this by defining a Gaussian and optimizing its mean and covariance matrix with respect to the Kullback-Leibler divergence between that Gaussian and the posterior.
- One can show that minimizing the Kullback-Leibler divergence is equivalent to minimizing the NELBO (negative evidence lower bound), which is computable.
- The parameters are the mean and covariance matrix of the Gaussian we use to approximate the true posterior (the covariance is simplified to a diagonal matrix with a single value repeated on the diagonal). That is one vector and one scalar.
- The loss is the NELBO.
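Written out under the usual definitions, for a variational posterior q(w) and prior p(w) the objective described above is:

```latex
\mathrm{NELBO}(q) \;=\; -\,\mathbb{E}_{q(w)}\big[\log p(y \mid X, w)\big] \;+\; \mathrm{KL}\big(q(w)\,\|\,p(w)\big)
```

which matches what `compute_objective` computes: the negated expected log-likelihood plus the KL term.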
class LogisticRegression(nn.Module):
    def __init__(self, input_dim):
        super(LogisticRegression, self).__init__()
        self.prior_w = NormalDiagonal(d=input_dim)
        self.posterior_w = NormalDiagonal(d=input_dim)

    @args_as_tensors(1)
    def predict_y(self, X, mc_samples=1):
        w_samples = self.posterior_w.sample(mc_samples)
        y_samples = logistic(torch.mm(X, w_samples.T)).mean(axis=1)
        return y_samples
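For reference, here is a minimal sketch of what a `NormalDiagonal` with the "one vector and one scalar" parameterization could look like (the actual class is not shown in the question, so the names and details below are assumptions). The key point the sketch illustrates is that a reparameterized sample, `mean + std * noise`, keeps the samples differentiable with respect to the parameters:

```python
import torch

class NormalDiagonal:
    """Hypothetical diagonal Gaussian: a mean vector plus one shared
    scalar log-variance (the 'one vector and one scalar' above)."""
    def __init__(self, d):
        self.mean = torch.zeros(d, requires_grad=True)
        self.logvar = torch.zeros(1, requires_grad=True)

    def sample(self, n):
        # Reparameterization trick: w = mu + sigma * eps. Because mean and
        # logvar appear inside the expression, gradients flow through samples.
        eps = torch.randn(n, self.mean.shape[0])
        return self.mean + torch.exp(0.5 * self.logvar) * eps
```

With this shape convention, `sample(n)` returns an `(n, d)` matrix, consistent with the `torch.mm(X, w_samples.T)` call in `predict_y`.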
Here is the class that computes the NELBO loss:
class VariationalObjective(nn.Module):
    def __init__(self, model, likelihood, N, mc_samples=1):
        super(VariationalObjective, self).__init__()
        self.N = N
        self.model = model
        self.likelihood = likelihood
        self.mc_samples = mc_samples

    def expected_loglikelihood(self, Xbatch, ybatch):
        ypred = self.model.predict_y(Xbatch, self.mc_samples)  # use the model stored on self, not the global
        ybatch = torch.tensor(ybatch.reshape((ybatch.shape[0],)))  # makes it cleaner for later
        logliks = self.likelihood.logdensity(ybatch, ypred)  # likewise, the likelihood stored on self
        return (1 / self.mc_samples) * (self.N / len(Xbatch)) * torch.sum(logliks)

    def kl(self):
        return kl_divergence(self.model.posterior_w, self.model.prior_w)

    def compute_objective(self, Xbatch, ybatch):
        return -self.expected_loglikelihood(Xbatch, ybatch) + self.kl()
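As a sanity check on the scaling inside expected_loglikelihood: multiplying the batch sum by N / batch_size rescales a minibatch estimate up to the full dataset. A toy calculation with made-up numbers (not the question's data):

```python
import torch

# Toy illustration of the (1/mc_samples) * (N/batch_size) rescaling:
# the batch sum of log-likelihoods is scaled so that, in expectation,
# it matches the full-dataset sum.
N, batch_size, mc_samples = 1000, 100, 10
logliks = torch.full((batch_size,), -0.5)  # pretend per-point log-likelihoods
estimate = (1 / mc_samples) * (N / batch_size) * torch.sum(logliks)
print(estimate.item())  # (1/10) * 10 * (-50.0) = -50.0
```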
Here is my optimization step:
data = genfromtxt('data/binaryclass2.csv', delimiter=',')
X = data[..., :-1]
y = data[..., -1].reshape(-1, 1)

dataset = Dataset(X, y, minibatch_size=1000)
likelihood = Bernoulli()
model = LogisticRegression(X.shape[1])
nelbo = VariationalObjective(model, likelihood, X.shape[1], mc_samples=10)
optim = torch.optim.SGD([model.posterior_w.mean, model.posterior_w.logvar], lr=0.001)

num_iterations = 50
for step in range(num_iterations):
    optim.zero_grad()
    Xbatch, ybatch = dataset.next_batch()
    loss = nelbo.compute_objective(Xbatch, ybatch)
    loss.backward()
    print(model.posterior_w.mean.grad)    # prints 0
    print(model.posterior_w.logvar.grad)  # prints 0
    optim.step()
I don't know what I am doing wrong; maybe PyTorch cannot track gradients through all these nested instantiations.
Thanks in advance.
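A generic way to check whether a loss is connected to the parameters at all (standard PyTorch behavior, not specific to the classes above): a value produced outside the autograd graph has no grad_fn, and backward() through it cannot produce gradients for the parameters.

```python
import torch

# If loss.grad_fn is None, the loss is not connected to any computation
# graph, so backward() cannot reach the parameters.
p = torch.zeros(3, requires_grad=True)

connected = (p * 2.0).sum()        # graph intact: grad_fn is set
broken = (p.detach() * 2.0).sum()  # detach() severs the graph, much like a
                                   # non-reparameterized sample would
print(connected.grad_fn is not None)  # True
print(broken.grad_fn is None)         # True
```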