PyTorch: How do I extract only a subset of classes from torchvision.datasets.CIFAR10?
How can I extract only 2 or 3 classes from torchvision.datasets.CIFAR10? The standard way to load all 10 classes is:
import torch
import torchvision
import torchvision.transforms as transforms

# Convert images to tensors and normalize each channel to roughly [-1, 1]
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=2)
testset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4, shuffle=False, num_workers=2)
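Looking at the source of CIFAR10, you can see that the data is stored as a numpy array and the labels as a list. You can therefore subclass it and filter both accordingly. An example: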
import numpy as np
import torchvision

# Written for torchvision versions before 0.2.2, which expose
# train_data/train_labels and test_data/test_labels.
class SubLoader(torchvision.datasets.CIFAR10):
    def __init__(self, *args, exclude_list=None, **kwargs):
        super(SubLoader, self).__init__(*args, **kwargs)

        # Nothing to filter: behave exactly like CIFAR10
        if not exclude_list:
            return

        # Row vector of excluded class indices, shape (1, num_excluded)
        exclude = np.array(exclude_list).reshape(1, -1)

        if self.train:
            labels = np.array(self.train_labels)
            # True for samples whose label matches none of the excluded classes
            mask = ~(labels.reshape(-1, 1) == exclude).any(axis=1)
            self.train_data = self.train_data[mask]
            self.train_labels = labels[mask].tolist()
        else:
            labels = np.array(self.test_labels)
            mask = ~(labels.reshape(-1, 1) == exclude).any(axis=1)
            self.test_data = self.test_data[mask]
            self.test_labels = labels[mask].tolist()
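The masking step used in the class above can be tried in isolation. The following is a minimal sketch on toy arrays; the label values and the tiny data array are made-up stand-ins, not CIFAR10 contents. It also checks that np.isin gives the same mask as the broadcast comparison.

```python
import numpy as np

# Hypothetical stand-ins for the dataset's labels and images.
labels = np.array([0, 3, 5, 3, 9, 0])
data = np.arange(len(labels) * 2).reshape(len(labels), 2)
exclude_list = [3, 9]

# Compare each label (column vector) with each excluded class (row vector).
# Broadcasting yields a (num_samples, num_excluded) boolean matrix.
exclude = np.array(exclude_list).reshape(1, -1)
mask = ~(labels.reshape(-1, 1) == exclude).any(axis=1)

# np.isin builds the same mask more directly.
assert (mask == ~np.isin(labels, exclude_list)).all()

# Apply the same mask to data and labels so they stay aligned.
kept_data = data[mask]
kept_labels = labels[mask].tolist()
print(kept_labels)  # [0, 5, 0]
```

Because one mask indexes both arrays, each remaining image keeps its original label.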
super(SubLoader, self).__init__(*args, **kwargs) raises TypeError: __init__() missing 1 required positional argument: 'root'. How do I instantiate it?
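loader = SubLoader(root='./data', train=True, download=True)
and
loader = SubLoader('./data', train=True, download=True)
should both work.

The solution provided by @Jatentaki works for older versions of torchvision. From 0.2.2 onward, you can simply use self.data and self.targets, with no distinction between the train and test cases.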