Python: How to correctly plot train/val on the same figure in PyTorch
I have the following code to train a model and store the logs in a results variable:
import tqdm.notebook as tq
import sys

num_epochs = 10
results = {"train_loss": [], "val_loss": [], "train_acc": [], "val_acc": []}
for epoch in range(1, num_epochs+1):
    sys.stdout.write(f"---Epoch {epoch}/{num_epochs}: ")
    epoch_loss = {"train": [], "val": []}
    epoch_acc = {"train": [], "val": []}
    for phase in ['train', 'val']:
        if phase == "train":
            model.train(True)
        else:
            model.train(False)
        # most important thing I learned from this project was how to fix tqdm nastiness in colab
        for batch_idx, (x, y) in tq.tqdm(enumerate(dataloaders[phase]),
                                         total=len(dataloaders[phase]),
                                         leave=False):
            # put data to device and get output
            x, y = x.to(device), y.to(device)
            preds = model(x)
            # calc and log model loss
            batch_loss = criterion(preds, y)
            epoch_loss[phase].append(batch_loss.item())
            # calculate acc and extend to epoch_acc
            preds = torch.argmax(preds, dim=1)
            batch_acc = torch.sum(preds == y)/len(y)
            epoch_acc[phase].append(batch_acc)
            # zero the grad
            optimizer.zero_grad()
            # take a step if training mode is on
            if phase == "train":
                batch_loss.backward()
                optimizer.step()
                scheduler.step()
    # at the end of each epoch, calculate avg epoch train/val loss/accuracy
    train_loss = sum(epoch_loss["train"])/len(epoch_loss["train"])
    val_loss = sum(epoch_loss["val"])/len(epoch_loss["val"])
    train_acc = 100*sum(epoch_acc["train"])/len(epoch_acc["train"])
    val_acc = 100*sum(epoch_acc["val"])/len(epoch_acc["val"])
    # log losses and accs every epoch
    results['train_loss'].extend(epoch_loss['train'])
    results['train_acc'].extend(epoch_acc['train'])
    results['val_loss'].extend(epoch_loss['val'])
    results['val_acc'].extend(epoch_acc['val'])
    # and print it nicely
    sys.stdout.write("train_loss: {:.4f} train_acc: {:.2f}% ".format(train_loss, train_acc))
    sys.stdout.write("val_loss: {:.4f} val_acc: {:.2f}%\n".format(val_loss, val_acc))
I log the per-batch accuracy and loss into separate train/val loss/acc arrays. The problem is that I have more training batches than validation batches, so when I try to plot the training logs I get something like this:
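The mismatch can be reproduced without PyTorch. With more training batches than validation batches per epoch, the per-batch logs end up with different lengths, so the two series cannot share an x-axis (the batch counts below are hypothetical):

```python
# Hypothetical counts: 10 epochs, 50 train batches and 10 val batches per epoch.
num_epochs, n_train_batches, n_val_batches = 10, 50, 10

results = {"train_loss": [], "val_loss": []}
for epoch in range(num_epochs):
    results["train_loss"].extend([0.5] * n_train_batches)  # one entry per train batch
    results["val_loss"].extend([0.6] * n_val_batches)      # one entry per val batch

# The two series have different lengths, so plotting them against a
# shared batch index misaligns them.
print(len(results["train_loss"]))  # 500
print(len(results["val_loss"]))    # 100
```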
Is there a workaround?

You are making a couple of conceptual mistakes. If you log per-epoch averages instead of per-batch values, both the train and val series have n_epochs values and can be plotted on the same axis. What "solution" are you expecting? Either plot epoch scores instead of batch scores, or use separate plots. Otherwise you could repeat values in the val series, but that defeats the purpose of the plot entirely. Also, in validation mode you can disable gradients with torch.no_grad(); validation will run faster and use less memory. Finally, you don't need to call optimizer.zero_grad() during the validation phase.