Python: linear classifier from scratch with PyTorch


I am trying to implement a linear classifier in PyTorch, using a single layer of tensors W and b, softmax and cross-entropy loss. For each batch I have to:

  • compute the logits
  • convert the logits to probabilities with softmax
  • compute the most probable class
  • compute the cross-entropy between the true classes and the predictions
  • update W and b with the optimizer

So far I have the following (flattened MNIST is already loaded with scikit-learn; the full code is shown at the end of this post):

For some reason, W and b do not change. What am I doing wrong?
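
A quick, hypothetical way to confirm whether the optimizer moves the parameters at all (not part of the original post) is to snapshot W before training and compare it afterwards, assuming the W and optimizer defined in the code at the end of the post:

    # hypothetical sanity check: compare W before and after the training loop below
    W_before = W.detach().clone()
    # ... run the training loop from the code at the end of the post ...
    print("W unchanged:", torch.equal(W_before, W.detach()))
    print("largest update:", (W.detach() - W_before).abs().max().item())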

EDIT: In the code above I have seen and tried, e.g., this. It is a minimal working example.

EDIT 2: The gradient W.grad is usually …, and I do not think it should be. The class probabilities are definitely correct (so it is not, e.g., like this), since I have checked that each row sums to 1, i.e. the probabilities over all classes add up to 1 for every sample.
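
For reference, a minimal version of the kind of check described in EDIT 2, assuming the probas and W from the code below (the exact checks from the original post are not shown):

    # every row of the softmax output should sum to 1 (up to floating point error)
    row_sums = probas.sum(dim=1)
    print(torch.allclose(row_sums, torch.ones_like(row_sums)))   # expected: True

    # inspect the gradient on W after loss.backward() has been called
    print(W.grad.abs().max())   # a (near-)zero gradient would explain why W barely moves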

Your code has typos so I cannot run it, but your problem seems to be that your learning rate is far too small for a linear classifier. Try a learning rate of 0.01. — @jodag I have corrected the typos, but yes, you are right: after further testing the network does seem to work; the LR was just so low that it looked as if it was not training (see the sketch after the code below).

    import torch
    import torch.nn.functional as torch_f   # used below as torch_f.softmax / torch_f.log_softmax

    # device, dtype, input_dim, output_dim and the flattened MNIST arrays
    # (X_train_flat, y_train, ...) come from the earlier scikit-learn loading code (not shown)

    # convert Numpy arrays to PyTorch tensors
    input_X_train = torch.from_numpy(X_train_flat).float().to(device)
    input_X_val = torch.from_numpy(X_val_flat).float().to(device)
    input_X_test = torch.from_numpy(X_test_flat).float().to(device)
    
    input_y_train = torch.from_numpy(y_train).long().to(device)
    input_y_val = torch.from_numpy(y_val).long().to(device)
    input_y_test = torch.from_numpy(y_test).long().to(device)
    
    # model parameters: W and b
    W = torch.randn(input_dim, output_dim, device=device, dtype=dtype, requires_grad=True)
    b = torch.randn(1, device=device, dtype=dtype, requires_grad=True)
    
    BATCH_SIZE = 512
    EPOCHS = 40
    LEARNING_RATE = 1e-6
    
    # create torch.optim.Adam optimizer for loss function minimization
    optimizer = torch.optim.Adam([W, b], lr=LEARNING_RATE)
    
    # create negative log loss function object for loss function evaluation
    # use mean loss value from all batch samples
    loss_fn = torch.nn.NLLLoss(reduction="mean")
    
    for t in range(EPOCHS):    
        # logits for input_X, resulting shape should be [input_X.shape[0], 10]
        logits = torch.matmul(input_X_train, W) + b
    
        # apply torch.nn.functional.softmax (torch_f.softmax) to logits
        probas = torch_f.softmax(logits, dim=1)
        
        # apply torch.argmax to find a class index with highest probability
        classes = torch.argmax(probas, dim=1)
    
        # loss should be a scalar: the average loss over all samples (reduction="mean")
        # PyTorch's NLLLoss expects *log*-probabilities - you must first take the log of
        # the softmax and then the negative log likelihood (which flips the sign)

        # Apply torch.nn.functional.log_softmax (torch_f.log_softmax) to the logits and pass
        # the result together with input_y_train to the loss. This is equivalent to computing
        # cross-entropy (log and then NLL) on top of probas, but is more numerically stable
        # (read the docs).
        log_probas = torch_f.log_softmax(logits, dim=1)
        loss = loss_fn(log_probas, input_y_train)
    
        # Before the backward pass, use the optimizer object to zero all of the
        # gradients for the variables it will update (which are the learnable
        # weights of the model). This is because by default, gradients are
        # accumulated in buffers (i.e., not overwritten) whenever .backward()
        # is called. Check out the docs of torch.autograd.backward for more details.
        optimizer.zero_grad()
        
        # calculate backward gradients for backpropagation
        loss.backward()
        
        # Calling the step function on an Optimizer makes an update to its parameters
        optimizer.step()
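
Following the comment about the learning rate, here is a minimal sketch of the suggested change, reusing the W, b, loss_fn and tensors defined above; the value 0.01 comes from the comment, the rest is only illustrative:

    # same training loop as above, but with the larger learning rate from the comment
    LEARNING_RATE = 0.01
    optimizer = torch.optim.Adam([W, b], lr=LEARNING_RATE)

    for t in range(EPOCHS):
        logits = torch.matmul(input_X_train, W) + b
        log_probas = torch_f.log_softmax(logits, dim=1)
        loss = loss_fn(log_probas, input_y_train)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if t % 10 == 0:
            print(f"epoch {t}: loss {loss.item():.4f}")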