Python TypeError: img should be PIL Image. Got <class 'numpy.ndarray'> - PyTorch

I am preparing some image data for my neural network to classify. As part of the image preprocessing step, I apply a HOG filter inside my dataset class, as follows:

class GetHogData(Dataset):

  def __init__(self, df, root, transform = None):
    self.df = df
    self.root = root
    self.transform = transform

  def __len__(self):
    return len(self.df)

  def __getitem__(self, idx):
    if torch.is_tensor(idx):
      idx = idx.tolist()

    img_path = os.path.join(self.root, self.df.iloc[idx, 0])
    # image = Image.open(img_path)
    image = cv2.imread(img_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    label = self.df.iloc[idx, 1]

    if self.transform:
      image = self.transform(image)

    hog_, hog_image = hog(
        image,
        orientations = 9,
        pixels_per_cell = (14,14),
        cells_per_block = (2,2),
        block_norm = "L1")
    
    image = np.transpose(image, (2, 0, 1))

    img_hog_lbl = {
        "image" : torch.tensor(image, dtype = torch.float32),
        "label" : torch.tensor(label, dtype = torch.long),
        "hog": torch.tensor(hog_, dtype = torch.float32)
    }
    return img_hog_lbl
After this, I define my training and validation transforms as:

# Image mean and standard dev 

img_mean = [0.485, 0.456, 0.406]
img_std = [0.229, 0.224, 0.225]

train_trans = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(img_mean, img_std)
    ])
        
test_trans = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(img_mean, img_std)
    ])
Finally, I create the loaders as follows:

train_img = GetHogData(df = train_lab, root = "/content/train", transform = train_trans)
test_img = GetHogData(df = test_lab ,root = "/content/test", transform = test_trans)
However, when I try to preview a training image using test_img[1], I get the error:

TypeError                                 Traceback (most recent call last)
<ipython-input-132-b9a9394eb1e0> in <module>()
----> 1 test_img[1]

5 frames
/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_pil.py in resize(img, size, interpolation)
    207 def resize(img, size, interpolation=Image.BILINEAR):
    208     if not _is_pil_image(img):
--> 209         raise TypeError('img should be PIL Image. Got {}'.format(type(img)))
    210     if not (isinstance(size, int) or (isinstance(size, Sequence) and len(size) in (1, 2))):
    211         raise TypeError('Got inappropriate size arg: {}'.format(size))

TypeError: img should be PIL Image. Got <class 'numpy.ndarray'>

I have tried adding transforms.ToPILImage() to my transforms, but I got an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-135-b9a9394eb1e0> in <module>()
----> 1 test_img[1]

1 frames
<ipython-input-129-8551c2e76038> in __getitem__(self, idx)
     27         pixels_per_cell = (14,14),
     28         cells_per_block = (2,2),
---> 29         block_norm = "L1")
     30 
     31     image = np.transpose(image, (2, 0, 1))

/usr/local/lib/python3.7/dist-packages/skimage/feature/_hog.py in hog(image, orientations, pixels_per_cell, cells_per_block, block_norm, visualize, transform_sqrt, feature_vector, multichannel)
    273     n_blocks_col = (n_cells_col - b_col) + 1
    274     normalized_blocks = np.zeros((n_blocks_row, n_blocks_col,
--> 275                                   b_row, b_col, orientations))
    276 
    277     for r in range(n_blocks_row):

ValueError: negative dimensions are not allowed

Does anyone have any ideas? Thanks in advance.

Edit - new error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-154-b9a9394eb1e0> in <module>()
----> 1 test_img[1]

<ipython-input-151-8551c2e76038> in __getitem__(self, idx)
     27         pixels_per_cell = (14,14),
     28         cells_per_block = (2,2),
---> 29         block_norm = "L1")
     30 
     31     image = np.transpose(image, (2, 0, 1))

ValueError: too many values to unpack (expected 2)

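The problem, as I wrote in the comments, is that skimage requires the data to be an ndarray, but you are giving it a torch tensor, hence the error.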
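
To see why a channel-first tensor ends in "negative dimensions are not allowed", here is a rough sketch of the block-count arithmetic. It is modeled on the n_blocks_col line visible in the traceback. The helper name hog_block_grid is made up for illustration, and the cell-count step (integer division of the image size by the cell size) is an assumption about what skimage does internally.

```python
def hog_block_grid(shape, pixels_per_cell=(14, 14), cells_per_block=(2, 2)):
    """Return (n_blocks_row, n_blocks_col), treating the first two axes as rows and cols.

    Hypothetical helper: it only mirrors the arithmetic shown in the traceback,
    it is not part of skimage.
    """
    s_row, s_col = shape[:2]
    c_row, c_col = pixels_per_cell
    b_row, b_col = cells_per_block
    # Assumed step: whole cells that fit along each axis.
    n_cells_row = s_row // c_row
    n_cells_col = s_col // c_col
    # Same formula as line 273 of _hog.py in the traceback.
    n_blocks_row = (n_cells_row - b_row) + 1
    n_blocks_col = (n_cells_col - b_col) + 1
    return n_blocks_row, n_blocks_col


# Channel-last image, as skimage expects: a valid 15 x 15 grid of blocks.
hwc_grid = hog_block_grid((224, 224, 3))

# Channel-first output of ToTensor(): the "row" axis has length 3,
# so no 14-pixel cell fits and the block count goes negative.
chw_grid = hog_block_grid((3, 224, 224))

print(hwc_grid, chw_grid)
```

With a 3 x 224 x 224 array the row axis is only 3 pixels long, so the row block count comes out as -1, which is the negative dimension np.zeros complains about.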

Try this:

    train_trans = transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(img_mean, img_std),
        lambda x: np.rollaxis(x.numpy(), 0, 3)
    ])
Edit: this basically converts the output to an ndarray and moves the channel axis to the end.
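
The lambda at the end of the Compose can be illustrated with plain NumPy. In this small sketch, the array chw is only a stand-in for x.numpy() (the tensor converted to an ndarray), and the np.transpose call is an equivalent spelling added for comparison.

```python
import numpy as np

# Stand-in for x.numpy(): a channel-first (C, H, W) array.
chw = np.arange(3 * 224 * 224, dtype=np.float32).reshape(3, 224, 224)

# Move axis 0 (channels) so that it ends up last: (H, W, C).
hwc = np.rollaxis(chw, 0, 3)

# The same reordering written as an explicit transpose.
hwc_alt = np.transpose(chw, (1, 2, 0))

print(chw.shape, hwc.shape)
```

The channel-last layout is what hog() needs to treat the first two axes as image rows and columns.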

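But as you can see, this is not the best way to solve the problem, because you have to convert the PIL image to a tensor, then the tensor to an ndarray, and then the ndarray back to a tensor.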

A better way would be, for example, to convert the PIL image directly to an ndarray and normalize it in __getitem__, and in the transforms just use:

    train_trans = transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
    ])
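
A minimal sketch of the normalization step done directly on an ndarray, assuming the image is already an H x W x 3 uint8 array such as np.asarray(pil_image) would return. The helper name normalize_hwc is hypothetical; img_mean and img_std are the values from the question.

```python
import numpy as np

img_mean = [0.485, 0.456, 0.406]
img_std = [0.229, 0.224, 0.225]


def normalize_hwc(image_uint8, mean=img_mean, std=img_std):
    """Scale a channel-last uint8 image to [0, 1], then standardize per channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    mean_arr = np.asarray(mean, dtype=np.float32)
    std_arr = np.asarray(std, dtype=np.float32)
    # Broadcasting applies mean and std along the last (channel) axis.
    return (scaled - mean_arr) / std_arr


# Synthetic all-white image standing in for np.asarray(pil_image).
demo = np.full((224, 224, 3), 255, dtype=np.uint8)
normalized = normalize_hwc(demo)
print(normalized.shape, normalized.dtype)
```

The result stays channel-last, so it can go straight into hog() and only needs one transpose before being wrapped in torch.tensor.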
Edit 2: you need to either add visualize=True to the hog() call or remove , hog_image from the assignment. If you don't need hog_image, the latter is preferred.

    hog_, hog_image = hog(
        image, visualize=True,
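
The two call styles can be compared side by side. This is a sketch on a synthetic grayscale image rather than the asker's data; a 2-D input is used so that no channel argument is needed, since that argument differs between skimage versions.

```python
import numpy as np
from skimage.feature import hog

# Synthetic 224 x 224 grayscale image (assumption: stands in for real data).
image = np.random.default_rng(0).random((224, 224))

# Without visualize=True, hog() returns only the 1-D feature vector.
features = hog(
    image,
    orientations=9,
    pixels_per_cell=(14, 14),
    cells_per_block=(2, 2),
    block_norm="L1",
)

# With visualize=True, hog() returns a (features, hog_image) pair.
features_vis, hog_image = hog(
    image,
    orientations=9,
    pixels_per_cell=(14, 14),
    cells_per_block=(2, 2),
    block_norm="L1",
    visualize=True,
)

print(features.shape, hog_image.shape)
```

Writing hog_, hog_image = hog(...) without visualize=True tries to unpack the long feature vector into two names, which is where "too many values to unpack (expected 2)" comes from.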

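"I have tried adding transforms.ToPILImage() to my transforms, but with no luck" We can only tell you what went wrong with that attempt if you show us exactly how you tried it.

Hi @KarlKnechtel, I have updated the post to show what I got. Thanks.

That shows the error you got when you "tried to add" that code. I still don't see the version of the code where that part is "added", so I still can't guess how you "tried to add" it.

My apologies, @KarlKnechtel - I have now included it, thanks again.

The data source itself is correct as of the last edit. skimage requires the data to be an ndarray, but you gave it a torch tensor, hence the error.

Hi @NatthaphonHongcharoen - thanks for the suggestion, I believe it worked, but I am now getting a new error: ValueError Traceback (most recent call last) ----> 1 test_img[1] in __getitem__(self, idx) 27 pixels_per_cell = (14,14), 28 cells_per_block = (2,2), ---> 29 block_norm = "L1") 30 31 image = np.transpose(image, (2, 0, 1)) ValueError: too many values to unpack (expected 2)

Please put it in the question, this is too hard to read here.

My apologies, I have added it @NatthaphonHongcharoen

@ZedZee added to the answer.

Thank you very much, @NatthaphonHongcharoen - the problem was indeed the extra hog_image variable - thanks again.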