Python TypeError: img should be PIL Image. Got <class 'numpy.ndarray'> - PyTorch

python numpy opencv pytorch

I am preparing some image data for my neural network to classify. As part of the image preprocessing steps, I apply a HOG filter inside my dataset class, as follows:

class GetHogData(Dataset):

  def __init__(self, df, root, transform = None):
    self.df = df
    self.root = root
    self.transform = transform

  def __len__(self):
    return len(self.df)

  def __getitem__(self, idx):
    if torch.is_tensor(idx):
      idx = idx.tolist()

    img_path = os.path.join(self.root, self.df.iloc[idx, 0])
    # image = Image.open(img_path)
    image = cv2.imread(img_path)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    label = self.df.iloc[idx, 1]

    if self.transform:
      image = self.transform(image)

    hog_, hog_image = hog(
        image,
        orientations = 9,
        pixels_per_cell = (14,14),
        cells_per_block = (2,2),
        block_norm = "L1")
    
    image = np.transpose(image, (2, 0, 1))

    img_hog_lbl = {
        "image" : torch.tensor(image, dtype = torch.float32),
        "label" : torch.tensor(label, dtype = torch.long),
        "hog": torch.tensor(hog_, dtype = torch.float32)
    }
    return img_hog_lbl

train_img = GetHogData(df = train_lab, root = "/content/train", transform = train_trans)
test_img = GetHogData(df = test_lab ,root = "/content/test", transform = test_trans)

After this, I define my training and validation transforms as:

# Image mean and standard dev 

img_mean = [0.485, 0.456, 0.406]
img_std = [0.229, 0.224, 0.225]

train_trans = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(img_mean, img_std)
    ])
        
test_trans = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(img_mean, img_std)
    ])

Finally, I create the datasets as follows:

train_img = GetHogData(df = train_lab, root = "/content/train", transform = train_trans)
test_img = GetHogData(df = test_lab ,root = "/content/test", transform = test_trans)

However, when I try to preview an image using test_img[1], I get the error:

TypeError                                 Traceback (most recent call last)
<ipython-input-132-b9a9394eb1e0> in <module>()
----> 1 test_img[1]

5 frames
/usr/local/lib/python3.7/dist-packages/torchvision/transforms/functional_pil.py in resize(img, size, interpolation)
    207 def resize(img, size, interpolation=Image.BILINEAR):
    208     if not _is_pil_image(img):
--> 209         raise TypeError('img should be PIL Image. Got {}'.format(type(img)))
    210     if not (isinstance(size, int) or (isinstance(size, Sequence) and len(size) in (1, 2))):
    211         raise TypeError('Got inappropriate size arg: {}'.format(size))

TypeError: img should be PIL Image. Got <class 'numpy.ndarray'>

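I have tried adding transforms.ToPILImage() to my transforms, but without success. I then get a different error:

---------------------------------------------------------------------------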
ValueError                                Traceback (most recent call last)
<ipython-input-135-b9a9394eb1e0> in <module>()
----> 1 test_img[1]

1 frames
<ipython-input-129-8551c2e76038> in __getitem__(self, idx)
     27         pixels_per_cell = (14,14),
     28         cells_per_block = (2,2),
---> 29         block_norm = "L1")
     30 
     31     image = np.transpose(image, (2, 0, 1))

/usr/local/lib/python3.7/dist-packages/skimage/feature/_hog.py in hog(image, orientations, pixels_per_cell, cells_per_block, block_norm, visualize, transform_sqrt, feature_vector, multichannel)
    273     n_blocks_col = (n_cells_col - b_col) + 1
    274     normalized_blocks = np.zeros((n_blocks_row, n_blocks_col,
--> 275                                   b_row, b_col, orientations))
    276 
    277     for r in range(n_blocks_row):

ValueError: negative dimensions are not allowed

Does anyone have any ideas? Thanks in advance.

Edit - new error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-154-b9a9394eb1e0> in <module>()
----> 1 test_img[1]

<ipython-input-151-8551c2e76038> in __getitem__(self, idx)
     27         pixels_per_cell = (14,14),
     28         cells_per_block = (2,2),
---> 29         block_norm = "L1")
     30 
     31     image = np.transpose(image, (2, 0, 1))

ValueError: too many values to unpack (expected 2)

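The problem, as I wrote in the comments, is that skimage requires the data to be an ndarray, but you are giving it a torch tensor, hence the error.

Try this: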
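To see why a tensor-shaped input can end in "negative dimensions are not allowed", here is a rough sketch of the block-count arithmetic that appears in the traceback (n_blocks_col = (n_cells_col - b_col) + 1). The helper hog_block_counts is hypothetical and is not part of skimage. It assumes that rows and columns are read from the first two axes of the input and that the number of cells is the integer quotient of size by pixels_per_cell.

```python
def hog_block_counts(shape, pixels_per_cell=(14, 14), cells_per_block=(2, 2)):
    """Mirror the block-count arithmetic visible in the traceback.

    Assumption: rows and columns come from the first two axes of `shape`.
    """
    s_row, s_col = shape[:2]
    c_row, c_col = pixels_per_cell
    b_row, b_col = cells_per_block
    n_cells_row = s_row // c_row          # whole cells along the rows
    n_cells_col = s_col // c_col          # whole cells along the columns
    n_blocks_row = (n_cells_row - b_row) + 1
    n_blocks_col = (n_cells_col - b_col) + 1
    return n_blocks_row, n_blocks_col


chw = hog_block_counts((3, 224, 224))   # channels-first layout, as after ToTensor()
hwc = hog_block_counts((224, 224, 3))   # channels-last layout
print(chw)  # (-1, 15)
print(hwc)  # (15, 15)
```

With the channel axis first, the "row" size is 3, which is smaller than one 14-pixel cell, so the block count goes negative. With the channel axis last, both counts are positive.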

    train_trans = transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize(img_mean, img_std),
        lambda x: np.rollaxis(x.numpy(), 0, 3)
    ])

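Edit: this basically converts the output to an ndarray and moves the channel axis to the end.

But as you can see, this is not the best way to solve the problem, because you have to convert the PIL image to a tensor, then the tensor to an ndarray, and then the ndarray back to a tensor.

A better way would be, for example, to convert the PIL image directly to an ndarray and normalize it inside __getitem__, and in the transform just use: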
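As a small illustration of what the lambda in the transform does to the axes, the sketch below applies np.rollaxis to a zero-filled stand-in array. The shape (3, 224, 224) is an assumption based on the 224-pixel crop used in the transforms.

```python
import numpy as np

# Stand-in for x.numpy() after ToTensor(): channels come first.
chw = np.zeros((3, 224, 224), dtype=np.float32)

# Roll axis 0 (channels) until it sits before position 3, i.e. at the end.
hwc = np.rollaxis(chw, 0, 3)

print(chw.shape)  # (3, 224, 224)
print(hwc.shape)  # (224, 224, 3)
```

np.moveaxis(chw, 0, -1) should give the same layout and may read more clearly.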

    train_trans = transforms.Compose([
        transforms.ToPILImage(),
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
    ])
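A minimal sketch of the normalization step that would then move into __getitem__, assuming the transform returns a PIL image that is converted with np.asarray. The helper normalize_hwc and the constant names are hypothetical, and the white test image only stands in for real pixel data.

```python
import numpy as np

IMG_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMG_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_hwc(pixels):
    """Scale uint8 H x W x C pixels to [0, 1], then normalize per channel."""
    scaled = np.asarray(pixels, dtype=np.float32) / 255.0
    return (scaled - IMG_MEAN) / IMG_STD  # broadcasts over the last axis


# Stand-in for np.asarray(pil_image) after the PIL-only transforms.
pixels = np.full((224, 224, 3), 255, dtype=np.uint8)

hwc = normalize_hwc(pixels)          # channels-last ndarray for the HOG step
chw = np.transpose(hwc, (2, 0, 1))   # channels-first copy for torch.tensor(...)
print(hwc.shape, chw.shape)  # (224, 224, 3) (3, 224, 224)
```

This keeps the image as an ndarray until the final torch.tensor(...) call, so there is no detour through a tensor and back.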

Edit 2: Per the reference, you need to either add visualize=True to the hog() call or remove , hog_image from the assignment. The latter is preferred if you do not need hog_image.

    hog_, hog_image = hog(
        image, visualize=True,
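The unpacking error itself can be reproduced without skimage. In the sketch below, fake_hog is a hypothetical stand-in that only mimics the return shape described above: a flat feature vector by default, and a (features, image) pair when visualize=True. The vector length of 8100 is an assumption: 15 x 15 blocks of 2 x 2 cells with 9 orientations, for a 224 x 224 input.

```python
def fake_hog(visualize=False):
    """Hypothetical stand-in, not skimage: mimics only the return shape."""
    features = [0.0] * 8100              # flat feature vector
    if visualize:
        return features, "hog image placeholder"
    return features


try:
    hog_, hog_image = fake_hog()         # one return value, two targets
except ValueError as err:
    message = str(err)                   # begins "too many values to unpack"
    print(message)

hog_, hog_image = fake_hog(visualize=True)   # option 1: ask for both values
hog_ = fake_hog()                            # option 2: keep only the features
```

Python tries to spread the 8100 features over two names, which is why the message says it expected 2.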

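Comments:

"I have tried to add transforms.ToPILImage() to my transforms but with no luck" - We can only tell you what went wrong if you show us exactly how you tried it.

Hi @KarlKnechtel, I have updated the post to show what I get. Thanks.

That shows the error you got when you "tried to add" that code. I still do not see the version of the code with that part "added", so I still cannot guess how you "tried to add" it.

My apologies, @KarlKnechtel - I have included it now. Thanks again.

The data source from the last edit is itself correct; skimage requires the data to be an ndarray, but you gave it a torch tensor, hence the error.

Hi @NatthaphonHongcharoen - thanks for your suggestion, I believe it works, but I now get a new error at the hog() call in __getitem__: ValueError: too many values to unpack (expected 2)

Put it in the question, it is too hard to read here.

My apologies, I have added it, @NatthaphonHongcharoen.

@ZedZee Added to the answer.

Thank you very much, @NatthaphonHongcharoen - the problem was indeed the extra hog_image variable. Thanks again.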