PyTorch: changing image size and value range


I want to ask about data transforms. If I have an image of size 28*28 and I want to resize it to 32*32, I know this can be done with transforms.Resize(), but I am not sure how.

Similarly for normalization: I want the values to be in the range [-1, 1], whereas previously I normalized into the range [0, 1] using transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)). That works for 3 channels; if I want it for 1 channel, i.e. transforms.Normalize((X,), (Y,)) under the same conditions with the values in the [-1, 1] range, would that make sense?

This transformation takes the desired output shape as an argument of its constructor:

transforms.Resize((32, 32))
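For instance, a minimal sketch (assuming a PIL image as input, which is what torchvision transforms expect by default; the blank 28x28 image is just a hypothetical stand-in):

from PIL import Image
import torchvision.transforms as transforms

image = Image.new("L", (28, 28))  # stand-in 28x28 grayscale image

resized = transforms.Resize((32, 32))(image)  # (height, width)
print(resized.size)  # (32, 32)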


Since the Normalize transformation won't give you the [-1, 1] range out of the box, that part is handled separately below. Resizing MNIST to 32x32 (height x width) can be done like so:

import tempfile

import torchvision

dataset = torchvision.datasets.MNIST(
    root=tempfile.gettempdir(),
    download=True,
    train=True,
    # Simply put the size you want in Resize (can be tuple for height, width)
    transform=torchvision.transforms.Compose(
        [torchvision.transforms.Resize(32), torchvision.transforms.ToTensor()]
    ),
)

print(dataset[0][0].shape) # 1, 32, 32 (channels, height, width)
When it comes to normalization, you can look at the source of PyTorch's per-channel normalization. It depends on whether you want it per channel or in some other form, but something along those lines should work. The underlying min-max scaling formula is out = low + (x - min) * (high - low) / (max - min), applied here per channel; a custom transformation implementing it is defined further below.
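As a quick sanity check of that formula, plain arithmetic mapping [0, 1] onto [-1, 1] (nothing assumed beyond the formula itself):

low, high = -1, 1
minimum, maximum = 0.0, 1.0

for x in (0.0, 0.5, 1.0):
    print(low + (x - minimum) * (high - low) / (maximum - minimum))
# -1.0, 0.0, 1.0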

You have to provide a Tuple of minimum values and a Tuple of maximum values (one value per channel in each), just like for standard PyTorch torchvision normalization. You can calculate those from your data; for MNIST you could compute them like this:

import torch

def per_channel_op(data, op=torch.max):
    # data has shape (samples, channels, height, width);
    # reduce over samples, then height, then width,
    # leaving one value per channel
    over_samples, _ = op(data, dim=0)        # (C, H, W)
    over_height, _ = op(over_samples, dim=1) # (C, W)
    over_width, _ = op(over_height, dim=1)   # (C,)
    return over_width

# Unsqueeze to add the missing channel dimension for MNIST,
# and divide by 255 as the images are uint8 by default
data = dataset.data.unsqueeze(1).float() / 255

# One value per channel (a single value each here, as MNIST has one channel)
maximum = per_channel_op(data)
minimum = per_channel_op(data, op=torch.min)
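For MNIST these come out as the extremes of the [0, 1] range, since the data contains both pure black and pure white pixels:

print(maximum, minimum)  # tensor([1.]) tensor([0.])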
Finally, to apply the normalization on MNIST. Beware: because the per-channel minimum and maximum of MNIST are exactly 0 and 1, this maps the data onto the full [-1, 1] range; it will act differently on datasets like CIFAR, whose per-channel extremes differ. First, the custom Normalize transformation:
import dataclasses
import typing

import torch


@dataclasses.dataclass
class Normalize:
    maximum: typing.Tuple  # per-channel maxima
    minimum: typing.Tuple  # per-channel minima
    low: int = -1
    high: int = 1

    def __call__(self, tensor):
        maximum = torch.as_tensor(self.maximum, dtype=tensor.dtype, device=tensor.device)
        minimum = torch.as_tensor(self.minimum, dtype=tensor.dtype, device=tensor.device)
        # Min-max scale each channel from [minimum, maximum] to [low, high]
        return self.low + (
            (tensor - minimum[:, None, None]) * (self.high - self.low)
        ) / (maximum[:, None, None] - minimum[:, None, None])

Then pass it to the dataset together with the maximum and minimum computed earlier (one value per channel in each tuple, so a single value each here, which also covers the single-channel case from the question):
dataset = torchvision.datasets.MNIST(
    root=tempfile.gettempdir(),
    download=True,
    train=True,
    transform=torchvision.transforms.Compose(
        [
            torchvision.transforms.Resize(32),
            torchvision.transforms.ToTensor(),
            # Apply the custom transformation with Lambda; plain tuples
            # (one float per channel) are the safest thing to pass in
            torchvision.transforms.Lambda(
                Normalize(tuple(maximum.tolist()), tuple(minimum.tolist()))
            ),
        ]
    ),
)
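A quick check of the result (a sketch; given the extremes computed above, the normalized MNIST tensors should span exactly [-1, 1]):

sample, label = dataset[0]
print(sample.shape)                # torch.Size([1, 32, 32])
print(sample.min(), sample.max())  # tensor(-1.) tensor(1.)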