PyTorch: changing the size and value range of an image
I would like to ask about data transforms. If I have an image of size 28*28 and want to resize it to 32*32, I know this can be done with transforms.Resize(), but I'm not sure how.
Also, for normalization: I want the values to lie in the range [-1, 1]. Previously I normalized to the range [0, 1] using transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)).
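This transform takes the desired output shape as a constructor argument:
transforms.Resize((32, 32))
Since Normalize computes out = (in - mean) / std for each channel, choosing the mean and std accordingly gives the range you want.
Resizing MNIST to 32x32 (height x width) can be done like this: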
import tempfile

import torchvision

dataset = torchvision.datasets.MNIST(
    root=tempfile.gettempdir(),
    download=True,
    train=True,
    # Put the target size in Resize: an int resizes the shorter side,
    # a (height, width) tuple sets both dimensions
    transform=torchvision.transforms.Compose(
        [torchvision.transforms.Resize(32), torchvision.transforms.ToTensor()]
    ),
)
print(dataset[0][0].shape)  # 1, 32, 32 (channels, height, width)
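As for normalization, you can look at the source of PyTorch's per-channel normalization. Whether you want to normalize per channel or in some other form is up to you, but something along these lines should work (see the min-max normalization formula; the Normalize class given further down applies it per channel).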
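The min-max rescaling referred to above can be written as follows. The symbols are notation chosen here for illustration: $x_c$ is a pixel value in channel $c$, $\min_c$ and $\max_c$ are that channel's minimum and maximum, and $[\text{low}, \text{high}]$ is the target range.

```latex
x'_c = \text{low} + \frac{(x_c - \min_c)\,(\text{high} - \text{low})}{\max_c - \min_c}
```

With low = -1 and high = 1, a pixel at the channel minimum maps to -1 and a pixel at the channel maximum maps to 1.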
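You have to provide a tuple of minimum values and a tuple of maximum values (one value per channel in each), just as with torchvision's standard Normalize. You can compute them from the data; for MNIST you could do it like this: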
import torch

def per_channel_op(data, op=torch.max):
    # data has shape (samples, channels, height, width);
    # reduce over samples, then height, then width
    reduced_samples, _ = op(data, dim=0)
    reduced_height, _ = op(reduced_samples, dim=1)
    reduced_width, _ = op(reduced_height, dim=1)
    return reduced_width  # one value per channel

# Unsqueeze to add the singleton channel dimension that MNIST lacks
# Divide by 255 because the raw data is uint8
data = dataset.data.unsqueeze(1).float() / 255
# Extrema over all samples: one value per channel, so a single value each for MNIST
maximum = per_channel_op(data)
minimum = per_channel_op(data, op=torch.min)
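As a cross-check on the helper above, here is a sketch assuming a PyTorch version that provides torch.amax and torch.amin, which accept several dimensions at once. Reducing over the sample, height, and width dimensions in one call should give the same per-channel values as the step-by-step reduction. The random batch is a hypothetical stand-in for image data, not MNIST.

```python
import torch

# Hypothetical batch: 8 images, 3 channels, 4x4 pixels, values in [0, 1).
batch = torch.rand(8, 3, 4, 4)

# Reduce over samples (0), height (2) and width (3), keeping the channel dimension.
channel_max = torch.amax(batch, dim=(0, 2, 3))
channel_min = torch.amin(batch, dim=(0, 2, 3))

# Step-by-step reduction in the style of per_channel_op, for comparison.
stepwise_max = batch.max(dim=0).values.max(dim=1).values.max(dim=1).values

print(channel_max.shape)  # one entry per channel
```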
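For Normalize, this works for 3 channels. What if I want it for 1 channel, i.e. transforms.Normalize((X,), (Y,)), with the same requirement that the values lie in [-1, 1]? That makes sense!
Finally, to apply the normalization to MNIST, define the Normalize class and pass it to the transform pipeline (note that on MNIST the output will consist mostly of values near -1 and 1, because most pixels are either black or white; datasets such as CIFAR will behave differently):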
import dataclasses
import typing

import torch

@dataclasses.dataclass
class Normalize:
    maximum: typing.Tuple
    minimum: typing.Tuple
    low: int = -1
    high: int = 1

    def __call__(self, tensor):
        # Match the input tensor's dtype and device
        maximum = torch.as_tensor(self.maximum, dtype=tensor.dtype, device=tensor.device)
        minimum = torch.as_tensor(self.minimum, dtype=tensor.dtype, device=tensor.device)
        # Min-max rescale each channel to [low, high];
        # [:, None, None] broadcasts the per-channel values over height and width
        return self.low + (
            (tensor - minimum[:, None, None]) * (self.high - self.low)
        ) / (maximum[:, None, None] - minimum[:, None, None])
dataset = torchvision.datasets.MNIST(
    root=tempfile.gettempdir(),
    download=True,
    train=True,
    transform=torchvision.transforms.Compose(
        [
            torchvision.transforms.Resize(32),
            torchvision.transforms.ToTensor(),
            # Wrap the custom transformation in Lambda and pass the
            # per-channel extrema as plain tuples of floats
            torchvision.transforms.Lambda(
                Normalize(tuple(maximum.tolist()), tuple(minimum.tolist()))
            ),
        ]
    ),
)
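To see how the same formula behaves on multi-channel data, here is a self-contained sketch applied to a random 3-channel batch. The function name rescale_per_channel and the synthetic data are illustrative assumptions and are not part of the answer's code.

```python
import torch


def rescale_per_channel(images, low=-1.0, high=1.0):
    """Min-max rescale a (N, C, H, W) batch to [low, high], channel by channel."""
    # Per-channel extrema, kept broadcastable against (N, C, H, W).
    minimum = torch.amin(images, dim=(0, 2, 3), keepdim=True)
    maximum = torch.amax(images, dim=(0, 2, 3), keepdim=True)
    return low + (images - minimum) * (high - low) / (maximum - minimum)


# Hypothetical stand-in for a batch of colour images with values in [0, 1).
batch = torch.rand(16, 3, 32, 32)
rescaled = rescale_per_channel(batch)

# Each channel now spans the target range (up to floating-point rounding).
print(torch.amin(rescaled, dim=(0, 2, 3)), torch.amax(rescaled, dim=(0, 2, 3)))
```

A channel whose maximum equals its minimum would cause a division by zero, so real code would need to guard against that case.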