How does torch.distributed.launch assign data to each GPU?

python pytorch

When the batch size is 1 or 2 and we have 8 GPUs, how does torch.distributed.launch assign data to each GPU? I wrapped my model in torch.nn.parallel.DistributedDataParallel:

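from torch.nn.parallel import DistributedDataParallel

# args.local_rank is the per-process GPU index passed in by torch.distributed.launch
model = DistributedDataParallel(model,
                                device_ids=[args.local_rank],
                                output_device=args.local_rank,
                                find_unused_parameters=True,
                                )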

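But the documentation says that DistributedDataParallel:

parallelizes the application of the given module by splitting the input across the specified devices by chunking in the batch dimension.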

My question is: when the batch size is smaller than the number of GPUs, how does it handle that?

They don't. Unlike DataParallel, the batch size you set is per GPU. When you have 8 GPUs and a batch size of 1, your effective batch size is 8.
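To illustrate the arithmetic, here is a rough pure-Python sketch. The helper shard_indices is hypothetical: it only imitates, in simplified and unshuffled form, the way a distributed sampler hands each process its own slice of the dataset. It is not PyTorch's implementation. The dataset length of 16 is likewise just an assumption for the example.

```python
import math

def shard_indices(dataset_len, world_size, rank):
    """Hypothetical helper: simplified, unshuffled imitation of per-rank sharding."""
    per_rank = math.ceil(dataset_len / world_size)
    total = per_rank * world_size
    indices = list(range(dataset_len))
    # Pad by wrapping around so every rank receives the same number of samples.
    indices = (indices * math.ceil(total / dataset_len))[:total]
    # Each rank takes every world_size-th index, starting at its own rank.
    return indices[rank:total:world_size]

world_size = 8          # one process per GPU
per_gpu_batch_size = 1  # the batch_size each process gives its own DataLoader
effective_batch_size = per_gpu_batch_size * world_size

shards = [shard_indices(16, world_size, r) for r in range(world_size)]
# What each of the 8 processes would load on its first step.
first_step = [shard[:per_gpu_batch_size] for shard in shards]

print(effective_batch_size)
print(first_step)
```

No batch is ever split across GPUs here: every process draws a full batch of per_gpu_batch_size samples from its own shard, so one optimization step consumes per_gpu_batch_size * world_size samples in total.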

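Thanks for your reply. Does that mean the documentation is outdated?

@Amir I think they mean the "effective batch". I have always found the DDP documentation confusing. The key is to realize that each GPU has its own main Python process (each one does its own data loading and loss computation), and the processes only communicate through DistributedDataParallel and DistributedSampler.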
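As a rough sketch of the per-process data loading described in that comment, the snippet below builds a DistributedSampler for each rank in turn inside a single process. Passing num_replicas and rank explicitly is an assumption made so the example runs without an initialized process group. In a real launched job each process would construct only its own sampler. The toy 16-element dataset and the helper batches_for_rank are also made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

world_size = 8  # assumed: one process per GPU
dataset = TensorDataset(torch.arange(16))  # toy dataset holding the values 0..15

def batches_for_rank(rank, batch_size=1):
    # Hypothetical helper: shows what the process with this rank would load.
    sampler = DistributedSampler(
        dataset, num_replicas=world_size, rank=rank, shuffle=False
    )
    loader = DataLoader(dataset, batch_size=batch_size, sampler=sampler)
    return [batch[0].tolist() for batch in loader]

per_rank = {rank: batches_for_rank(rank) for rank in range(world_size)}
for rank, batches in per_rank.items():
    print(rank, batches)
```

With batch_size=1, each rank still gets a full batch of one sample per step, and the eight ranks together cover the dataset without overlap. When shuffling is enabled in real training, calling sampler.set_epoch(epoch) at the start of each epoch gives a different ordering per epoch.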