Deep learning: nan loss when training a PyTorch model with Adam


I'm new to training neural networks, so forgive me if this is a very silly question or breaks any unwritten Stack Overflow rules. I recently started working on the Titanic dataset. I cleaned the data and built a feature tensor by concatenating the standardized continuous columns with one-hot tensors of the categorical columns. When I pass this data through a simple linear model, I get a nan loss for every epoch.
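The preprocessing step described above is not shown in the post. A minimal sketch of how such a feature tensor might be built from the Kaggle Titanic CSV is given below; the file name, column selection, and median imputation are assumptions for illustration, not the asker's actual code.

import pandas as pd
import torch

df = pd.read_csv('train.csv')  # Kaggle Titanic training data (assumed path)

# Standardize the continuous columns to zero mean and unit variance.
cont = df[['Age', 'Fare']].fillna(df[['Age', 'Fare']].median())
cont = (cont - cont.mean()) / cont.std()
cont_tensor = torch.tensor(cont.values, dtype=torch.float32)

# One-hot encode the categorical columns.
cat = pd.get_dummies(df[['Pclass', 'Sex', 'Embarked']].astype(str))
cat_tensor = torch.tensor(cat.values, dtype=torch.float32)

# Concatenate into a single feature tensor; labels come from the Survived column.
features = torch.cat([cont_tensor, cat_tensor], dim=1)
labels = torch.tensor(df['Survived'].values, dtype=torch.float32)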

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from tqdm import tqdm
import pickle
import pathlib

path = pathlib.Path('./drive/My Drive/Kaggle/Titanic')

with open(path/'feature_tensor.pickle', 'rb') as f:
    features = pickle.load(f)

with open(path/'label_tensor.pickle', 'rb') as f:
    labels = pickle.load(f)

features = features.float()
labels = labels.float()

import math
valid_size = -1 * math.floor(0.2*len(features))

train_features = features[:valid_size]
valid_features = features[valid_size:]

train_labels = labels[:valid_size]
valid_labels = labels[valid_size:]

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.h_l1 = nn.Linear(18, 64)
        self.h_l2 = nn.Linear(64, 32)
        self.o_l = nn.Linear(32, 2)

    def forward(self, x):
        x = F.relu(self.h_l1(x))
        x = F.relu(self.h_l2(x))
        return self.o_l(x)

model = Model()
model.to('cuda')

optimizer = optim.Adam(model.parameters())
loss_fn = nn.MSELoss()

EPOCHS = 5
BATCH_SIZE = 20

for EPOCH in range(0, EPOCHS):
    for i in tqdm(range(0, len(features), BATCH_SIZE)):
        train_feature_batch = train_features[i:i+BATCH_SIZE,:].to('cuda')
        train_label_batch = train_labels[i:i+BATCH_SIZE,:].to('cuda')
        valid_feature_batch = valid_features[i:i+BATCH_SIZE,:].to('cuda')
        valid_label_batch = valid_labels[i:i+BATCH_SIZE,:].to('cuda')
        train_loss = loss_fn(model(train_feature_batch), train_label_batch)
        with torch.no_grad():
            valid_loss = loss_fn(model(valid_feature_batch), valid_label_batch)
        optimizer.zero_grad()
        train_loss.backward()
        optimizer.step()
    print(f"Epoch : {EPOCH}\tTrain Loss : {train_loss}\tValid_loss : {valid_loss}\n")
I get the following output:

100%|██████████| 45/45 [00:00<00:00, 511.50it/s]
100%|██████████| 45/45 [00:00<00:00, 604.10it/s]
100%|██████████| 45/45 [00:00<00:00, 586.21it/s]
  0%|          | 0/45 [00:00<?, ?it/s]Epoch : 0 Train Loss : nan    Valid_loss : nan

Epoch : 1   Train Loss : nan    Valid_loss : nan

Epoch : 2   Train Loss : nan    Valid_loss : nan

100%|██████████| 45/45 [00:00<00:00, 555.55it/s]
100%|██████████| 45/45 [00:00<00:00, 607.65it/s]Epoch : 3   Train Loss : nan    Valid_loss : nan

Epoch : 4   Train Loss : nan    Valid_loss : nan

100%|██████████| 45/45

Comments:

Do the predictions and the labels have the same shape? Also consider using PyTorch's DataLoader or Dataset API. If your code looks like this in PyTorch, something is wrong, because PyTorch is about making code look better than this. I'm not criticizing, but I think it would be better to use PyTorch features like Dataset and DataLoader.

Sorry, I'm just a beginner here. I will definitely try using DataLoader and Dataset.

No problem, take a look here.

I read the book "Deep Learning with PyTorch" and learned a lot from it. I tried passing a single batch of 20 samples through the model and checking the output and the loss; those values were fine. When I run it over the whole dataset and across the epochs, I get this error.
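Following the suggestions in the comments, here is a minimal sketch of the two checks (prediction/label shapes, NaNs in the inputs) and of batching with TensorDataset/DataLoader. It reuses the features, labels, model, optimizer, and loss_fn from the question; it is only an illustrative sketch, not a confirmed fix for the nan loss.

from torch.utils.data import TensorDataset, DataLoader
import torch

# 1) Sanity checks: NaNs in the inputs propagate straight into the loss,
#    and a prediction/label shape mismatch makes MSELoss broadcast in
#    unintended ways.
print(torch.isnan(features).any().item(), torch.isnan(labels).any().item())
sample = train_features[:20].to('cuda')
print(model(sample).shape, train_labels[:20].shape)

# 2) Let TensorDataset/DataLoader handle the batching instead of manual
#    slicing, so training and validation data are iterated over their own lengths.
train_dl = DataLoader(TensorDataset(train_features, train_labels),
                      batch_size=20, shuffle=True)
valid_dl = DataLoader(TensorDataset(valid_features, valid_labels), batch_size=20)

for xb, yb in train_dl:
    xb, yb = xb.to('cuda'), yb.to('cuda')
    optimizer.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    optimizer.step()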