Python: Can't train ResNet using the GPU with PyTorch


I am trying to train a ResNet architecture on the CIFAR10 dataset using the GPU. Here is my ResNet code:

import torch
import torch.nn as nn
import torch.nn.functional as F

Then I train the network on the GPU:


net = ResNet18()
net = net.to('cuda')
train2(net, torch.optim.Adam(net.parameters(), lr=0.001), trainloader, criterion, n_ep=3)

I get this error:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

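This is annoying, because my weights should be on CUDA as well, given the call to resnet.cuda().

The train function works fine with another network, so the problem must come from the class above.

Also, next(resnet.parameters()).is_cuda returns True.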
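Note that next(net.parameters()).is_cuda inspects only the first registered parameter. A slightly broader check, sketched below, collects the devices of all registered parameters and buffers. The helper name devices_in_use and the small stand-in model are made up for this illustration, and the check still cannot see layers that were never registered on the module.

```python
import torch
import torch.nn as nn


def devices_in_use(module: nn.Module) -> set:
    """Return the device strings used by all registered parameters and buffers."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


# Hypothetical stand-in model; BatchNorm2d contributes buffers (running statistics).
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# More than one entry in the set would indicate a partially moved model.
print(devices_in_use(model))
```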

Update: here is my training function


def train(net, optimizer, trainload, criterion, n_ep=10, cuda = True):
  if cuda:
    net = net.to('cuda')


  for epoch in range(n_ep):
    for data in trainload:

      inputs, labels = data
      if cuda:
        inputs = inputs.type(torch.cuda.FloatTensor)
        labels = labels.type(torch.cuda.LongTensor)


      optimizer.zero_grad()

      print(next(net.parameters()).is_cuda)
      ## this actually prints "True" ! 



      outputs = net.forward(inputs)
      loss = criterion(outputs, labels)
      loss.backward()
      optimizer.step()

  return net

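The thing is, this training function works well with another type of network. For example, with the AlexNet class listed further down this page.

With that one, training on the GPU works fine.

There is one more thing I don't understand. I tried to train the network that I had moved to the GPU (using .cuda()) with training data that I (deliberately) did not move to the GPU. This time I got an error saying that the weight type is torch.cuda and the data type is not.

Edit: I thought this had to do with using nn.ModuleList instead of a regular Python list. But I tried that, and it did not fix the problem.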
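For context on that hypothesis, here is a minimal sketch of how the two containers differ. Layers kept in a plain Python list are not registered as submodules, so .to('cuda') and the optimizer do not see them, whereas nn.ModuleList registers each layer it holds. The class names PlainListNet and ModuleListNet are hypothetical and are not taken from the code in the question.

```python
import torch.nn as nn


class PlainListNet(nn.Module):
    def __init__(self):
        super().__init__()
        # Plain Python list: the layers inside are NOT registered as submodules.
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 2)]


class ModuleListNet(nn.Module):
    def __init__(self):
        super().__init__()
        # nn.ModuleList registers every layer it holds.
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 2)])


# Count the parameter tensors each module exposes (weights and biases).
plain_count = len(list(PlainListNet().parameters()))
registered_count = len(list(ModuleListNet().parameters()))
print(plain_count, registered_count)
```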

We would need a snippet of your training loop to pin down the error.

I assume that somewhere in that loop there are a few lines that do the following:

for data, label in CifarDataLoader:
     data, label = data.to('cuda'), label.to('cuda')

My first guess would be to add this line right before the for loop:

resnet = resnet.to('cuda')

Let me know whether this works. If it doesn't, I will need more code to find the error.
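Put together, the suggestion above could look roughly like the sketch below. The tiny linear model and the random tensors are hypothetical stand-ins for the real ResNet and the CIFAR10 loader. The device is picked at run time, so the same code also runs on a machine without a GPU.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-ins for the real network and the CIFAR10 loader.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).to(device)
fake_images = torch.randn(8, 3, 32, 32)
fake_labels = torch.randint(0, 10, (8,))
loader = DataLoader(TensorDataset(fake_images, fake_labels), batch_size=4)

criterion = nn.CrossEntropyLoss()
# The optimizer is created after the model has been moved to the device.
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

for data, label in loader:
    # Move each batch to the same device as the model.
    data, label = data.to(device), label.to(device)
    optimizer.zero_grad()
    loss = criterion(model(data), label)
    loss.backward()
    optimizer.step()
```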

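OK, I finally found it.

I had defined some nn.Module objects inside the forward function of my ResNetBlock class. I guess these could not be moved to the GPU because PyTorch only looks for such objects in the init function. I changed the implementation a bit so that the objects are defined in the init function, and it worked.

Thanks for your help :)

Thanks for posting your answer. I didn't know that myself, so it may come in handy in the future.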
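A minimal sketch of that difference follows. The original ResNetBlock code is not shown, so BrokenBlock and FixedBlock are hypothetical. As far as I understand it, a layer becomes a registered submodule when it is assigned as an attribute of the module (usually in the init function). A layer built inside forward is a fresh CPU object on every call, so .to('cuda') never reaches it, and its weights are neither trained nor kept between calls.

```python
import torch
import torch.nn as nn


class BrokenBlock(nn.Module):
    """Builds its conv layer inside forward(), so it is never registered."""

    def forward(self, x):
        # A brand-new layer with CPU weights is created on every call.
        conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        return conv(x)


class FixedBlock(nn.Module):
    """Builds its conv layer in __init__, so it is registered as a submodule."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)


# Only the registered layer shows up in parameters().
broken_params = len(list(BrokenBlock().parameters()))
fixed_params = len(list(FixedBlock().parameters()))
print(broken_params, fixed_params)

if torch.cuda.is_available():
    x = torch.randn(1, 3, 32, 32, device="cuda")
    FixedBlock().to("cuda")(x)  # weights and input are both on the GPU
    try:
        BrokenBlock().to("cuda")(x)  # the conv created in forward() stays on the CPU
    except RuntimeError as err:
        print(err)
```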



class AlexNet(nn.Module):

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(nn.Conv2d(3,64,11), nn.ReLU(),nn.MaxPool2d(2, stride = 2), nn.Conv2d(64,192,5),
                                     nn.ReLU(), nn.MaxPool2d(2, stride = 2), nn.Conv2d(192,384,3),
                                     nn.ReLU(),nn.Conv2d(384,256,3), nn.ReLU(), nn.Conv2d(256,256,3), nn.ReLU())
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x
