Classification with a pretrained PyTorch VGG16 model and its classes


I wrote an image classification model using PyTorch's pretrained VGG16 model.

import matplotlib.pyplot as plt
import numpy as np
import torch
from PIL import Image
import urllib.request
from skimage.transform import resize
from skimage import io
import yaml

# Downloading imagenet 1000 classes list
file = urllib.request.urlopen("https://gist.githubusercontent.com/yrevar/942d3a0ac09ec9e5eb3a/raw/238f720ff059c1f82f368259d1ca4ffa5dd8f9f5/imagenet1000_clsidx_to_labels.txt")
classes = ''
for f in file:
  classes = classes +  f.decode("utf-8")
classes = yaml.safe_load(classes)  # yaml.load() without a Loader fails on recent PyYAML

# Downloading pretrained vgg16 model
model = torch.hub.load('pytorch/vision:v0.6.0', 'vgg16', pretrained=True)

print(model)

for param in model.parameters():
    param.requires_grad = False


url, filename = ("https://raw.githubusercontent.com/pytorch/hub/master/dog.jpg", "dog.jpg")

image=io.imread(url)

plt.imshow(image)
plt.show()

# resize to 224x224x3
img = resize(image,(224,224,3))

plt.imshow(img)
plt.show()
# Normalizing input for vgg16
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
img1 = mean*img+std
img1 = np.clip(img1,0,1)

img1 = torch.from_numpy(img1).unsqueeze(0)
img1 = img1.permute(0,3,2,1) # batch_size x channels x height x width

model.eval()
pred = model(img1.float())
print(classes[torch.argmax(pred).numpy().tolist()])

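The code runs without errors, but the predicted class is wrong. I'm not sure what I did wrong; if I had to guess, it is either the ImageNet YAML class list or the normalization of the input image. Can anyone tell me where I went wrong?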

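There are a few problems with the image preprocessing. First, normalization is computed as (value - mean) / std, not value * mean + std. Second, the values should not be clipped to [0, 1]; normalization deliberately moves them out of that range. Third, the image as a NumPy array has shape [height, width, 3], and when you permuted the dimensions you swapped height and width, producing a tensor of shape [batch_size, channels, width, height].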

img = resize(image, (224, 224, 3))
# Normalizing input for vgg16
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]
img1 = (img - mean) / std
img1 = torch.from_numpy(img1).unsqueeze(0)
img1 = img1.permute(0, 3, 1, 2)  # batch_size x channels x height x width
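
The normalization and permutation points can be checked on a small synthetic array. This is only a sketch using NumPy: the array `img` below is a made-up stand-in for an image (height 2, width 4), not the dog picture from the question, and `correct` / `swapped` are illustrative names.

```python
import numpy as np

# Hypothetical stand-in for an image: height 2, width 4, 3 channels, values in [0, 1].
img = np.linspace(0.0, 1.0, 2 * 4 * 3).reshape(2, 4, 3)

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

# Standardization: subtract the per-channel mean, then divide by the per-channel std.
normalized = (img - mean) / std
# The result intentionally leaves [0, 1], so clipping would destroy information.
print(normalized.min() < 0, normalized.max() > 1)

batch = normalized[np.newaxis]               # shape (1, H, W, C)
correct = np.transpose(batch, (0, 3, 1, 2))  # (1, C, H, W)
swapped = np.transpose(batch, (0, 3, 2, 1))  # (1, C, W, H): height and width exchanged
print(correct.shape, swapped.shape)
```

With a square 224x224 input both orderings produce the same shape, so the model runs without complaint but receives a transposed image. That is consistent with the question's code executing cleanly while predicting the wrong class.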
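You can also use torchvision.transforms for the normalization:

from torchvision import transforms

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
img = resize(image, (224, 224, 3))
img1 = preprocess(img)
img1 = img1.unsqueeze(0)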
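If you load the image with PIL, you can also resize it by adding transforms.Resize to the preprocessing pipeline, or you can even add a transform that first converts the image to a PIL image (transforms.Resize requires a PIL image).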