Python RuntimeError: CUDA error: device-side assert triggered - Resnet18 on PyTorch


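I am trying to use PyTorch's Resnet18 model on my image data. Given the complexity of the model and the size of the data, I want to run it on CUDA. This is what I am doing: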

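import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torchvision import models

resnet_cnn = models.resnet18(pretrained = True)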
num_ftrs = resnet_cnn.fc.in_features
resnet_cnn.fc = nn.Linear(num_ftrs, 8)

criterion = nn.CrossEntropyLoss().cuda()
optimizer_ft = optim.SGD(resnet_cnn.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=5, gamma=0.1)

After this, I try to train and test my model with the following loop:

count = 0
loss_list = []
iteration_list = []
accuracy_list = []
epochs = 30

for epoch in range(epochs):
    for i, (images, labels) in enumerate(trainloader):
            resnet_cnn = resnet_cnn.cuda()
            images.cuda()
            labels.cuda()

            optimizer_ft.zero_grad()
            outputs = resnet_cnn(images.cuda())
            loss = criterion(outputs.cuda(), labels.cuda())
            loss.backward()
            optimizer_ft.step()

            count += 1

            if count % 50 == 0:
                correct = 0
                total = 0

                for i, (images, labels) in enumerate(testloader):
                    # images.to(device)
                    # labels.to(device)

                    outputs = resnet_cnn(images.cuda())
                    predicted = torch.max(outputs.data, 1)[1]
                    total += len(labels)
                    correct += (predicted == labels.cuda()).sum()
                accuracy = 100 * correct / float(total)

                loss_list.append(loss.data)
                iteration_list.append(count)
                accuracy_list.append(accuracy)

                if count % 500 == 0:
                    print("Iteration: {} Loss: {} Accuracy: {} %".format(count, loss.data, accuracy))
But I get the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-48-cb669e8d47c0> in <module>()
      7 for epoch in range(epochs):
      8     for i, (images, labels) in enumerate(trainloader):
----> 9             resnet_cnn = resnet_cnn.cuda()
     10             images.cuda()
     11             labels.cuda()

3 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in cuda(self, device)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492 
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    385     def _apply(self, fn):
    386         for module in self.children():
--> 387             module._apply(fn)
    388 
    389         def compute_should_use_set_data(tensor, tensor_applied):

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
    407                 # `with torch.no_grad():`
    408                 with torch.no_grad():
--> 409                     param_applied = fn(param)
    410                 should_use_set_data = compute_should_use_set_data(param, param_applied)
    411                 if should_use_set_data:

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in <lambda>(t)
    489             Module: self
    490         """
--> 491         return self._apply(lambda t: t.cuda(device))
    492 
    493     def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:

RuntimeError: CUDA error: device-side assert triggered

I don't know what I am doing wrong, since I trained a manually defined CNN the same way. Thanks in advance.
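
One detail in the loop above, separate from the assert itself: `Tensor.cuda()` returns a copy rather than moving the tensor in place, so the bare `images.cuda()` and `labels.cuda()` statements have no effect, and the model only needs to be moved once, before the loop. A minimal sketch of the usual pattern follows. The `nn.Linear` model and the random batches are hypothetical stand-ins for `resnet_cnn` and `trainloader`, chosen so the snippet runs without a dataset or a GPU.

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Fall back to the CPU when no GPU is present, so the sketch runs anywhere.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for resnet_cnn: 16 input features, 8 classes.
model = nn.Linear(16, 8).to(device)  # move the model once, outside the loop
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Hypothetical stand-in for trainloader: three batches of four samples each.
batches = [(torch.randn(4, 16), torch.randint(0, 8, (4,))) for _ in range(3)]

losses = []
for images, labels in batches:
    # .to(device) returns a new tensor, so the result must be assigned.
    images = images.to(device)
    labels = labels.to(device)

    optimizer.zero_grad()
    outputs = model(images)  # already on `device`, no extra .cuda() needed
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())  # store a Python float, not a tensor
```

Moving the model before the loop also keeps the repeated `resnet_cnn.cuda()` call out of the traceback, which makes it easier to see which operation actually fails.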

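Can you try running the code on the CPU? The output is usually more self-explanatory when the code runs on the CPU rather than on the GPU.

The problem is in a different line. You have to move everything to the CPU for the stack trace to be meaningful.

Hello @Francescoalogi, @NatthaphonHongcharoen - if I run the model on the CPU it works fine, but processing in the loop is very slow. There is no error when running on the CPU. Did I misunderstand you?

Please check what this line returns: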
torch.cuda.is_available()
Hello @PrajotKuvalekar -
torch.cuda.is_available()
returns
True
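
Since CUDA is available, the assert itself is worth chasing. A frequent cause of a device-side assert with `nn.CrossEntropyLoss` is a target value outside `[0, num_classes)`. The thread does not confirm that this is the cause here, so the following is only a sketch of how one might rule it out. The helper name `find_out_of_range_labels` and the example batches are hypothetical, and the code assumes each batch is an `(images, labels)` pair whose labels are a flat sequence or 1-D tensor of integer class indices.

```python
def find_out_of_range_labels(batches, num_classes):
    """Return the sorted label values that fall outside [0, num_classes)."""
    bad = set()
    for _, labels in batches:
        # Tensors expose .tolist(); plain sequences are used as they are.
        values = labels.tolist() if hasattr(labels, "tolist") else list(labels)
        for value in values:
            if not 0 <= int(value) < num_classes:
                bad.add(int(value))
    return sorted(bad)


# Hypothetical stand-in batches; the images are never inspected.
example_batches = [(None, [0, 3, 7]), (None, [1, 8, -1])]
bad_labels = find_out_of_range_labels(example_batches, num_classes=8)
print(bad_labels)  # [-1, 8]
```

With the loaders from the question this would be called as `find_out_of_range_labels(trainloader, 8)`, on the CPU, before anything is moved to the GPU. Two related points: once a device-side assert has fired, the CUDA context of that process stays unusable, so later calls such as `resnet_cnn.cuda()` can report the same error until the runtime is restarted. Setting the environment variable `CUDA_LAUNCH_BLOCKING=1` before the process first touches CUDA makes kernel launches synchronous, which tends to make the traceback point at the call that actually failed.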