Python RuntimeError: CUDA error: device-side assert triggered - Resnet18 on PyTorch
I am trying to use PyTorch's Resnet18 model on my image data. Given the complexity of the model and the size of the data, I want to run it with CUDA. Here is what I am doing:
resnet_cnn = models.resnet18(pretrained = True)
num_ftrs = resnet_cnn.fc.in_features
resnet_cnn.fc = nn.Linear(num_ftrs, 8)
criterion = nn.CrossEntropyLoss().cuda()
optimizer_ft = optim.SGD(resnet_cnn.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_ft, step_size=5, gamma=0.1)
After this, I try to train and test my model with the following loop:
count = 0
loss_list = []
iteration_list = []
accuracy_list = []
epochs = 30
for epoch in range(epochs):
    for i, (images, labels) in enumerate(trainloader):
        resnet_cnn = resnet_cnn.cuda()
        images.cuda()
        labels.cuda()
        optimizer_ft.zero_grad()
        outputs = resnet_cnn(images.cuda())
        loss = criterion(outputs.cuda(), labels.cuda())
        loss.backward()
        optimizer_ft.step()
        count += 1
        if count % 50 == 0:
            correct = 0
            total = 0
            for i, (images, labels) in enumerate(testloader):
                # images.to(device)
                # labels.to(device)
                outputs = resnet_cnn(images.cuda())
                predicted = torch.max(outputs.data, 1)[1]
                total += len(labels)
                correct += (predicted == labels.cuda()).sum()
            accuracy = 100 * correct / float(total)
            loss_list.append(loss.data)
            iteration_list.append(count)
            accuracy_list.append(accuracy)
        if count % 500 == 0:
            print("Iteration: {} Loss: {} Accuracy: {} %".format(count, loss.data, accuracy))
But I get the following error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-48-cb669e8d47c0> in <module>()
7 for epoch in range(epochs):
8 for i, (images, labels) in enumerate(trainloader):
----> 9 resnet_cnn = resnet_cnn.cuda()
10 images.cuda()
11 labels.cuda()
3 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in cuda(self, device)
489 Module: self
490 """
--> 491 return self._apply(lambda t: t.cuda(device))
492
493 def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
385 def _apply(self, fn):
386 for module in self.children():
--> 387 module._apply(fn)
388
389 def compute_should_use_set_data(tensor, tensor_applied):
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _apply(self, fn)
407 # `with torch.no_grad():`
408 with torch.no_grad():
--> 409 param_applied = fn(param)
410 should_use_set_data = compute_should_use_set_data(param, param_applied)
411 if should_use_set_data:
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in <lambda>(t)
489 Module: self
490 """
--> 491 return self._apply(lambda t: t.cuda(device))
492
493 def xpu(self: T, device: Optional[Union[int, device]] = None) -> T:
RuntimeError: CUDA error: device-side assert triggered
---------------------------------------------------------------------------
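A note on reading a traceback like the one above: CUDA kernels are launched asynchronously, so the Python line where the error surfaces (here `resnet_cnn.cuda()`) is not necessarily the line that triggered the assert. It may come from an earlier kernel, possibly one launched in a previous notebook cell. One way to narrow this down is to request synchronous launches through the `CUDA_LAUNCH_BLOCKING` environment variable. The following is a minimal sketch, assuming it runs at the very top of the script or notebook before any CUDA work starts; `launch_blocking_enabled` is a hypothetical helper added only for illustration.

```python
import os

# Assumption: this runs before torch is imported and before any CUDA call,
# otherwise the setting may have no effect on the already-created context.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"  # request synchronous kernel launches


def launch_blocking_enabled() -> bool:
    """Hypothetical helper: report whether synchronous launches were requested."""
    return os.environ.get("CUDA_LAUNCH_BLOCKING") == "1"


print("CUDA_LAUNCH_BLOCKING enabled:", launch_blocking_enabled())
```

With this setting the traceback should point closer to the operation that actually failed, at the cost of slower execution. After a device-side assert the CUDA context of that process is generally unusable, so restarting the runtime before re-running is likely necessary.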
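I don't know what I am doing wrong, since I trained a manually defined CNN the same way. Thanks in advance.

Can you try running the code on the CPU? Usually the output is more self-explanatory when the code runs on the CPU rather than on the GPU.

The problem is in another line. You have to set everything to CPU for the stack trace to work properly.

Hello @Francescoalogi, @NatthaphonHongcharoen - if I run the model on the CPU it works fine, but the processing in the loop is very slow. There is no error when it runs on the CPU. Am I misunderstanding you?

Please check what this line returns: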
torch.cuda.is_available()
Hello @PrajotKuvalekar - torch.cuda.is_available() returns True
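As a cheap additional check in the spirit of the CPU suggestion above: a frequent trigger of this assert with `nn.CrossEntropyLoss` is a target label outside the range `0 .. num_classes - 1` (here `0 .. 7`, given `nn.Linear(num_ftrs, 8)`). Since the CPU run reportedly works, this may well not be the cause in this case, but it is quick to rule out. The following is a sketch only, assuming labels are one-dimensional integer class indices; `find_bad_labels` and the example data are hypothetical and not part of the original code.

```python
from typing import Iterable, List, Tuple


def find_bad_labels(label_batches: Iterable[Iterable[int]],
                    num_classes: int) -> List[Tuple[int, int]]:
    """Return (batch_index, label) pairs for labels outside [0, num_classes - 1]."""
    bad = []
    for batch_idx, batch in enumerate(label_batches):
        # Tensor-like batches are converted with .tolist(); plain lists pass through.
        values = batch.tolist() if hasattr(batch, "tolist") else list(batch)
        for label in values:
            if not 0 <= int(label) < num_classes:
                bad.append((batch_idx, int(label)))
    return bad


# Illustrative data only: one label equals 8, which is invalid for an 8-class head.
example_batches = [[1, 2, 3], [7, 8, 0]]
print(find_bad_labels(example_batches, num_classes=8))  # prints [(1, 8)]
```

With the loaders from the question, the call might look like `find_bad_labels((labels for _, labels in trainloader), num_classes=8)`. An empty result would suggest looking elsewhere, for example at a stale CUDA context left over from an earlier failed cell.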