Python: Forcing PyTorch to use the GPU

python neural-network pytorch

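I recently followed a tutorial here and ended up with a working piece of code that generates masks outlining objects of interest.

Now I want to run it on my GPU, because the CPU is too slow.

I have installed CUDA and so on, but PyTorch refuses to use it. I have tried many tricks, such as setting torch.device and so on, but nothing worked; PyTorch keeps using 0 GPU.

The code is as follows: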

from PIL import Image
import torch
import torchvision.transforms as T
from torchvision import models
import numpy as np

fcn = None


device = torch.device('cuda')
torch.cuda.set_device(0)
print('Using device:', device)
print()

if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    print('Cached:', round(torch.cuda.memory_cached(0)/1024**3,1), 'GB')


def getRotoModel():
    global fcn
    #fcn = models.segmentation.fcn_resnet101(pretrained=True).eval()
    fcn = models.segmentation.deeplabv3_resnet101(pretrained=True).eval()


# Define the helper function
def decode_segmap(image, nc=21):

    label_colors = np.array([(0, 0, 0),  # 0=background
                           # 1=aeroplane, 2=bicycle, 3=bird, 4=boat, 5=bottle
               (128, 0, 0), (0, 128, 0), (128, 128, 0), (0, 0, 128), (128, 0, 128),
               # 6=bus, 7=car, 8=cat, 9=chair, 10=cow
               (0, 128, 128), (128, 128, 128), (64, 0, 0), (192, 0, 0), (64, 128, 0),
               # 11=dining table, 12=dog, 13=horse, 14=motorbike, 15=person
               (192, 128, 0), (64, 0, 128), (192, 0, 128), (64, 128, 128), (192, 128, 128),
               # 16=potted plant, 17=sheep, 18=sofa, 19=train, 20=tv/monitor
               (0, 64, 0), (128, 64, 0), (0, 192, 0), (128, 192, 0), (0, 64, 128)])

    r = np.zeros_like(image).astype(np.uint8)
    g = np.zeros_like(image).astype(np.uint8)
    b = np.zeros_like(image).astype(np.uint8)

    for l in range(0, nc):
        idx = image == l
        r[idx] = label_colors[l, 0]
        g[idx] = label_colors[l, 1]
        b[idx] = label_colors[l, 2]

    rgb = np.stack([r, g, b], axis=2)
    return rgb

def createMatte(filename, matteName, size):
    img = Image.open(filename)
    trf = T.Compose([T.Resize(size),
                     T.ToTensor(), 
                     T.Normalize(mean = [0.485, 0.456, 0.406], 
                                 std = [0.229, 0.224, 0.225])])
    inp = trf(img).unsqueeze(0)
    if fcn is None:
        getRotoModel()
    out = fcn(inp)['out']
    om = torch.argmax(out.squeeze(), dim=0).detach().cpu().numpy()
    rgb = decode_segmap(om)
    im = Image.fromarray(rgb)
    im.save(matteName)

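What can I do? Thanks.

If everything is set up correctly, you only need to move the tensors you want processed on the GPU to the GPU. You can try the following to make sure it works in general: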

import torch
t = torch.tensor([1.0]) # create tensor with just a 1 in it
t = t.cuda() # Move t to the gpu
print(t) # Should print something like tensor([1.], device='cuda:0')
print(t.mean()) # Test an operation just to be sure
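Whether "everything is set up correctly" can be checked before moving any tensors. The following is a minimal diagnostic sketch, assuming only a standard PyTorch install; it reports whether the installed build was compiled with CUDA support and whether a GPU is visible at runtime.

```python
import torch

# Report what the installed PyTorch build supports.
print("PyTorch version:", torch.__version__)
print("Built with CUDA:", torch.version.cuda)       # None for CPU-only builds
print("CUDA available:", torch.cuda.is_available())

# Only query device details when a GPU is actually visible.
if torch.cuda.is_available():
    print("Device count:", torch.cuda.device_count())
    print("Device 0:", torch.cuda.get_device_name(0))
```

If the CUDA version prints as None, the installed package is a CPU-only build, and no amount of device selection in the script will make it use the GPU.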

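You already have a device variable, so you can use .to(device) instead of .cuda(). This is also the preferable approach, because it lets you switch between CPU and GPU by setting a single variable.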
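As a rough illustration of the single-variable switch, here is a small sketch. The nn.Linear model is a hypothetical stand-in, not the segmentation network from the question, and the CPU fallback is an assumption added so the snippet also runs on machines without a GPU.

```python
import torch
import torch.nn as nn

# Use the GPU when PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in model; Module.to moves the parameters and returns the module.
model = nn.Linear(4, 2).to(device)

# Tensor.to returns a new tensor, so the result has to be kept.
x = torch.randn(3, 4).to(device)

with torch.no_grad():
    y = model(x)

print(y.device)
print(next(model.parameters()).device)
```

Note that calling x.to(device) without using the return value leaves x where it was, which is an easy way to end up with a model on the GPU and inputs still on the CPU.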

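Did you move any tensors to the GPU (by using .cuda()) or create any tensors on the GPU?

Hmm, I don't think this code does that? If I understand correctly, tensors are like GPU matrices, so I should rewrite all the numpy arrays as CUDA tensors? Because I think that would mean rewriting the whole thing.

What do you mean by it refuses to use CUDA but keeps using GPU 0? If it is using GPU 0, then it is also using CUDA. Are you looking for multi-GPU support, e.g. nn.DataParallel?

@jodag Sorry. "Using 0 GPU" means not using any GPU at all. When I run get_device_name my GPU shows up, but from the time it takes and from the Windows performance monitor I can see that the GPU is idle.

Do it like this: in getRotoModel(), add the line fcn.cuda() at the end, and change fcn(inp)['out'] to fcn(inp.cuda())['out']. If you want to use the GPU, you need to move both the model and the input tensor to the GPU.

So I need to rewrite most of the file, if not all of it, right? I was hoping there would be a "compute everything on the GPU" switch.

You really don't need to rewrite everything. Just add a few .to(device) calls. Judging from the code, doing this for the inp tensor and the fcn model is probably enough. The network will then run on the GPU, and I assume that is the part that takes the longest on the CPU.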
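The change described in the comments can be sketched as follows. To keep the snippet runnable without downloading pretrained weights, TinySegNet is a hypothetical stand-in that only mimics the {'out': tensor} output format of the torchvision segmentation model in the question, and predict_labels is an illustrative helper name, not part of the original code.

```python
import torch
import torch.nn as nn

# Use the GPU when available, otherwise fall back to the CPU (assumption for portability).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


def predict_labels(model, inp, device):
    """Run a segmentation model on `device` and return a NumPy label map."""
    model = model.to(device).eval()           # move the weights to the target device
    with torch.no_grad():                     # inference only, no gradients needed
        out = model(inp.to(device))["out"]    # input must be on the same device as the model
    # Copy the result back to the CPU before converting to NumPy.
    return torch.argmax(out.squeeze(0), dim=0).cpu().numpy()


class TinySegNet(nn.Module):
    """Hypothetical stand-in returning a dict with an 'out' entry."""

    def __init__(self, num_classes=21):
        super().__init__()
        self.conv = nn.Conv2d(3, num_classes, kernel_size=1)

    def forward(self, x):
        return {"out": self.conv(x)}


labels = predict_labels(TinySegNet(), torch.randn(1, 3, 8, 8), device)
print(labels.shape)
```

In the question's code this corresponds to moving fcn once inside getRotoModel() and passing inp.to(device) in createMatte(). The decode_segmap helper can stay as it is, since it only sees the NumPy array produced after the .cpu() call.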