PyTorch: changing image size and value range


I want to ask about data transforms. If I have an image of size 28*28 and I want to resize it to 32*32, I know this can be done with transforms.Resize(), but I am not sure how.

Similarly for normalization: I want the values to be in the range [-1, 1], whereas previously I normalized into the range [0, 1] using transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)). That works for 3 channels; if I want it for 1 channel, i.e. transforms.Normalize((X,), (Y,)) under the same conditions with the values in the [-1, 1] range, would that make sense?

This transformation takes the desired output shape as an argument of its constructor:

transforms.Resize((32, 32))
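For instance, a minimal sketch (assuming a PIL image as input, which is what torchvision transforms expect by default; the blank 28x28 image is just a hypothetical stand-in):

from PIL import Image
import torchvision.transforms as transforms

image = Image.new("L", (28, 28))  # stand-in 28x28 grayscale image

resized = transforms.Resize((32, 32))(image)  # (height, width)
print(resized.size)  # (32, 32)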


Since the Normalize transformation won't give you the [-1, 1] range out of the box, that part is handled separately below. Resizing MNIST to 32x32 (height x width) can be done like so:

import tempfile

import torchvision

dataset = torchvision.datasets.MNIST(
    root=tempfile.gettempdir(),
    download=True,
    train=True,
    # Simply put the size you want in Resize (can be tuple for height, width)
    transform=torchvision.transforms.Compose(
        [torchvision.transforms.Resize(32), torchvision.transforms.ToTensor()]
    ),
)

print(dataset[0][0].shape) # 1, 32, 32 (channels, height, width)
When it comes to normalization, you can look at the source of PyTorch's per-channel normalization. It depends on whether you want it per channel or in some other form, but something along those lines should work. The underlying min-max scaling formula is out = low + (x - min) * (high - low) / (max - min), applied here per channel; a custom transformation implementing it is defined further below.
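As a quick sanity check of that formula, plain arithmetic mapping [0, 1] onto [-1, 1] (nothing assumed beyond the formula itself):

low, high = -1, 1
minimum, maximum = 0.0, 1.0

for x in (0.0, 0.5, 1.0):
    print(low + (x - minimum) * (high - low) / (maximum - minimum))
# -1.0, 0.0, 1.0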

You have to provide a Tuple of minimum values and a Tuple of maximum values (one value per channel in each), just like for standard PyTorch torchvision normalization. You can calculate those from your data; for MNIST you could compute them like this:

import torch

def per_channel_op(data, op=torch.max):
    # data has shape (samples, channels, height, width);
    # reduce over samples, then height, then width,
    # leaving one value per channel
    over_samples, _ = op(data, dim=0)        # (C, H, W)
    over_height, _ = op(over_samples, dim=1) # (C, W)
    over_width, _ = op(over_height, dim=1)   # (C,)
    return over_width

# Unsqueeze to add the missing channel dimension for MNIST,
# and divide by 255 as the images are uint8 by default
data = dataset.data.unsqueeze(1).float() / 255

# One value per channel (a single value each here, as MNIST has one channel)
maximum = per_channel_op(data)
minimum = per_channel_op(data, op=torch.min)
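For MNIST these come out as the extremes of the [0, 1] range, since the data contains both pure black and pure white pixels:

print(maximum, minimum)  # tensor([1.]) tensor([0.])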
Finally, to apply the normalization on MNIST. Beware: because the per-channel minimum and maximum of MNIST are exactly 0 and 1, this maps the data onto the full [-1, 1] range; it will act differently on datasets like CIFAR, whose per-channel extremes differ. First, the custom Normalize transformation:
import dataclasses
import typing

import torch


@dataclasses.dataclass
class Normalize:
    maximum: typing.Tuple  # per-channel maxima
    minimum: typing.Tuple  # per-channel minima
    low: int = -1
    high: int = 1

    def __call__(self, tensor):
        maximum = torch.as_tensor(self.maximum, dtype=tensor.dtype, device=tensor.device)
        minimum = torch.as_tensor(self.minimum, dtype=tensor.dtype, device=tensor.device)
        # Min-max scale each channel from [minimum, maximum] to [low, high]
        return self.low + (
            (tensor - minimum[:, None, None]) * (self.high - self.low)
        ) / (maximum[:, None, None] - minimum[:, None, None])

Then pass it to the dataset together with the maximum and minimum computed earlier (one value per channel in each tuple, so a single value each here, which also covers the single-channel case from the question):
dataset = torchvision.datasets.MNIST(
    root=tempfile.gettempdir(),
    download=True,
    train=True,
    transform=torchvision.transforms.Compose(
        [
            torchvision.transforms.Resize(32),
            torchvision.transforms.ToTensor(),
            # Apply the custom transformation with Lambda; plain tuples
            # (one float per channel) are the safest thing to pass in
            torchvision.transforms.Lambda(
                Normalize(tuple(maximum.tolist()), tuple(minimum.tolist()))
            ),
        ]
    ),
)
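A quick check of the result (a sketch; given the extremes computed above, the normalized MNIST tensors should span exactly [-1, 1]):

sample, label = dataset[0]
print(sample.shape)                # torch.Size([1, 32, 32])
print(sample.min(), sample.max())  # tensor(-1.) tensor(1.)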