Python: Converting a PyTorch LSTM model to DataParallel for multi-GPU use


I need to convert my existing model to run on multiple GPUs using DataParallel. I have been reading the documentation for ages and still cannot figure out exactly what needs to change, so I could use some help. Thanks.

import torch
import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, input_size=1, hidden_layer_size=100, output_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size

        self.lstm = nn.LSTM(input_size, hidden_layer_size)

        self.linear = nn.Linear(hidden_layer_size, output_size)

        # Hidden and cell state, shape (num_layers, batch, hidden_size)
        self.hidden_cell = (torch.zeros(1, 1, self.hidden_layer_size),
                            torch.zeros(1, 1, self.hidden_layer_size))

    def forward(self, input_seq):
        # Reshape the input to (seq_len, batch=1, input_size) for the LSTM
        lstm_out, self.hidden_cell = self.lstm(
            input_seq.view(len(input_seq), 1, -1), self.hidden_cell)
        predictions = self.linear(lstm_out.view(len(input_seq), -1))
        return predictions[-1]
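
Storing hidden_cell as a module attribute is the part that clashes with DataParallel: the wrapper replicates the module onto each GPU on every forward pass, so attribute writes on the replicas are discarded and the stored tensors stay on the original device. Below is a minimal sketch of a DataParallel-friendly variant; the class name LSTMParallel and the batched input shape are assumptions for illustration, not part of the original code:

class LSTMParallel(nn.Module):
    def __init__(self, input_size=1, hidden_layer_size=100, output_size=1):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size
        # batch_first=True so DataParallel can scatter inputs along dim 0
        self.lstm = nn.LSTM(input_size, hidden_layer_size, batch_first=True)
        self.linear = nn.Linear(hidden_layer_size, output_size)

    def forward(self, input_seq):
        # input_seq: (batch, seq_len, input_size) -- assumed batched input
        batch_size = input_seq.size(0)
        # Build the initial state inside forward, on the replica's own
        # device, instead of keeping it as a module attribute
        h0 = torch.zeros(1, batch_size, self.hidden_layer_size,
                         device=input_seq.device)
        c0 = torch.zeros(1, batch_size, self.hidden_layer_size,
                         device=input_seq.device)
        lstm_out, _ = self.lstm(input_seq, (h0, c0))
        # Predict from the last time step of each sequence
        return self.linear(lstm_out[:, -1, :])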


model = LSTM()
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.00001)

epochs = 1


for i in range(epochs):
    count = 0
    for seq, labels in train_inout_seq:
        optimizer.zero_grad()
        # Reset the hidden state before each sequence
        model.hidden_cell = (torch.zeros(1, 1, model.hidden_layer_size),
                             torch.zeros(1, 1, model.hidden_layer_size))
        y_pred = model(seq)

        single_loss = loss_function(y_pred, labels)
        single_loss.backward()
        optimizer.step()

    print("Epoch: %d, loss: %1.5f" % (i, single_loss.item()))

model.eval()

testPredictions = []
realValues = []
for seq, labels in test_input_seq:
    with torch.no_grad():
        # Note: this must be hidden_cell (not hidden) to actually reset the state
        model.hidden_cell = (torch.zeros(1, 1, model.hidden_layer_size),
                             torch.zeros(1, 1, model.hidden_layer_size))
        # Run the forward pass once and reuse the result
        prediction = model(seq).item()
        testPredictions.append(prediction)
        print(prediction)
        realValues.append(labels)

model = nn.DataParallel(LSTM())
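
For reference, the usual wrapping order is to move the model to the GPU first and feed inputs batched along dim 0, since DataParallel scatters that dimension across devices. A minimal sketch, assuming the illustrative LSTMParallel variant above and made-up tensor shapes:

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = LSTMParallel().to(device)   # move parameters to the GPU first
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # replicates across all visible GPUs

inputs = torch.randn(8, 12, 1, device=device)  # (batch, seq_len, input_size), illustrative
outputs = model(inputs)             # each GPU receives a slice of the batch
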
What is the problem? @Berriel It gives me the following error: "Trying to backward through the graph a second time, but the buffers have already been freed. Specify retain_graph=True when calling backward the first time." When I set that variable to True, the first iteration becomes so slow that it is effectively not training at all; the problem does not occur when DataParallel is not used. Have you considered horovod instead of DataParallel? I know some people find it much easier to use.
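
One plausible explanation for that error, given the code above (an assumption, not a confirmed diagnosis): after wrapping, model is the DataParallel wrapper, so model.hidden_cell = ... creates a new attribute on the wrapper while the wrapped module keeps its old, graph-linked state, and the next backward pass walks into the already-freed graph. A sketch of resetting the state on the wrapped module instead:

# Inside the training loop, once model is wrapped in nn.DataParallel,
# the original module lives at model.module:
model.module.hidden_cell = (
    torch.zeros(1, 1, model.module.hidden_layer_size),
    torch.zeros(1, 1, model.module.hidden_layer_size),
)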