PyTorch: DataParallel can't reduce model inference time with the same total batch size

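When I use torch.nn.DataParallel() for data-parallel computation, I find that for the same total batch size, the inference time of the parallel model is not significantly lower than that of the serial model. The setup is as follows: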

model = SNModel()
criterion = nn.CrossEntropyLoss()
if parallel_enable:
    model = nn.DataParallel(model, device_ids=gpu_ids) # gpu_ids = [0,1,2,3] total 4 gpus
model.to(args.device)
criterion.to(args.device)
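For context, nn.DataParallel splits each input batch along dimension 0 across the listed devices, so with the same total batch size each of the 4 GPUs sees only about a quarter of the batch. Below is a minimal pure-Python sketch of the resulting per-device batch sizes, assuming the split behaves like torch.chunk (chunks of ceil(total / devices) elements). The function name `split_sizes` is hypothetical and only for illustration.

```python
import math


def split_sizes(total_batch, num_devices):
    """Per-device batch sizes, assuming a torch.chunk-style split along dim 0."""
    if total_batch <= 0:
        return []
    # Each chunk holds ceil(total / devices) samples; the last may be smaller
    chunk = math.ceil(total_batch / num_devices)
    sizes = []
    remaining = total_batch
    while remaining > 0:
        size = min(chunk, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    print(split_sizes(64, 4))  # an even split across 4 devices
    print(split_sizes(5, 4))   # an uneven split can leave a device idle
```

Under this assumption, a small total batch leaves each GPU with very little work per forward pass.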

Measuring the model inference time over 100 iterations:

torch.cuda.synchronize()
st = time.time()
outputs, loss = model(images, path, labels, criterion, gpu_nums)
torch.cuda.synchronize()
total_tm += time.time() - st
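The timing pattern above can be wrapped in a small helper. This is only a sketch: the name `time_iterations`, the `warmup` parameter, and the stand-in workload are my own assumptions, and the untimed warm-up calls are an addition that the original snippet does not have. The `sync` argument defaults to a no-op so the sketch runs without a GPU.

```python
import time


def time_iterations(step, n_iters=100, warmup=10, sync=lambda: None):
    """Return total wall-clock seconds spent in `step` over `n_iters` calls.

    `step` is a zero-argument callable that runs one forward pass.
    `sync` is called before and after each timed call; it defaults to a no-op.
    """
    # Untimed warm-up calls (an assumption, not in the original snippet)
    for _ in range(warmup):
        step()
    sync()

    total = 0.0
    for _ in range(n_iters):
        sync()  # make sure earlier queued work is finished before timing
        start = time.perf_counter()
        step()
        sync()  # wait for this call's work to finish before stopping the clock
        total += time.perf_counter() - start
    return total


if __name__ == "__main__":
    # Stand-in workload so the sketch runs anywhere
    total = time_iterations(lambda: sum(range(10_000)), n_iters=100)
    print(f"100 iter model time: {total:.4f}s")
```

With the setup in this post, `step` would wrap the call `model(images, path, labels, criterion, gpu_nums)` and `sync` would be `torch.cuda.synchronize`.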

Using a single GPU:

100 iter model time  23s (around)

Using 4 GPUs:

100 iter model time  103s (around)

That is a reduction of only 103 - 23*4 = 11 s.

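Is something wrong here? I hope you can help; this has puzzled me for a long time. I suspect there is something I'm missing. I've seen the same behavior with other models, so I always end up giving up on parallelism.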