Binary classification on MNIST: loss and accuracy stay constant
I am trying to do binary classification on the MNIST dataset. Class 0 means even digits and class 1 means odd digits. I am using a simplified version of VGG. My network's loss and accuracy stay constant from epoch to epoch. I should add that before I converted the targets to binary targets, my model reached over 90% accuracy, so something has probably gone wrong. This is where I change the targets to binary:
for i in range(10):
    idx = (train_set.targets == i)
    if (i == 0) or ((i % 2) == 0):
        train_set.targets[idx] = 0
    else:
        train_set.targets[idx] = 1
for i in range(10):
    idx = (test_set.targets == i)
    if (i == 0) or ((i % 2) == 0):
        test_set.targets[idx] = 0
    else:
        test_set.targets[idx] = 1
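The relabeling loop above can also be written as a single tensor operation. The following is a minimal sketch, assuming the targets are an integer tensor of digit labels; `targets` here is a hypothetical stand-in, not the real `train_set.targets`. It should give the same labels as the loop, since every even digit maps to 0 and every odd digit to 1.

```python
import torch

# Hypothetical stand-in for train_set.targets (not the real dataset).
targets = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 9, 4])

# Even digits -> 0, odd digits -> 1, in one vectorized step.
binary_targets = targets % 2

# Same relabeling written as the loop from the question, for comparison.
looped = targets.clone()
for i in range(10):
    idx = (targets == i)
    looped[idx] = 0 if i % 2 == 0 else 1

print(binary_targets.tolist())
print(torch.equal(binary_targets, looped))
```

Doing the relabeling once, before any data loader is built, also makes it easy to print the class balance and confirm the labels look as intended.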
This is my network:
import torch
import torch.nn as nn


class VGG16(nn.Module):
    def __init__(self, num_classes):
        super(VGG16, self).__init__()
        # calculate same padding:
        # (w - k + 2*p)/s + 1 = o
        # => p = (s(o-1) - w + k)/2
        self.block_1 = nn.Sequential(
            nn.Conv2d(in_channels=1,
                      out_channels=64,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      # (1(32-1)- 32 + 3)/2 = 1
                      padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.Conv2d(in_channels=64,
                      out_channels=64,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2),
                         stride=(2, 2))
        )
        self.block_2 = nn.Sequential(
            nn.Conv2d(in_channels=64,
                      out_channels=128,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.Conv2d(in_channels=128,
                      out_channels=128,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2),
                         stride=(2, 2))
        )
        self.block_3 = nn.Sequential(
            nn.Conv2d(in_channels=128,
                      out_channels=256,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(in_channels=256,
                      out_channels=256,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.Conv2d(in_channels=256,
                      out_channels=256,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2),
                         stride=(2, 2))
        )
        self.block_4 = nn.Sequential(
            nn.Conv2d(in_channels=256,
                      out_channels=512,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.Conv2d(in_channels=512,
                      out_channels=512,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.Conv2d(in_channels=512,
                      out_channels=512,
                      kernel_size=(3, 3),
                      stride=(1, 1),
                      padding=1),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 2),
                         stride=(2, 2))
        )
        self.classifier = nn.Sequential(
            nn.Linear(2048, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.65),
            nn.Linear(4096, 4096),
            nn.ReLU(True),
            nn.Dropout(p=0.65),
            nn.Linear(4096, num_classes),
            nn.Sigmoid()
        )
        for m in self.modules():
            if isinstance(m, torch.nn.Conv2d) or isinstance(m, torch.nn.Linear):
                nn.init.kaiming_uniform_(m.weight, mode='fan_in', nonlinearity='leaky_relu')
                # nn.init.xavier_normal_(m.weight)
                if m.bias is not None:
                    m.bias.detach().zero_()
        # self.avgpool = nn.AdaptiveAvgPool2d((7, 7))

    def forward(self, x):
        x = self.block_1(x)
        x = self.block_2(x)
        x = self.block_3(x)
        x = self.block_4(x)
        # x = self.avgpool(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x
        #logits = self.classifier(x)
        #probas = F.softmax(logits, dim=1)
        # probas = nn.Softmax(logits)
        #return probas
        # return logits
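The first `nn.Linear(2048, 4096)` layer fixes the input resolution the network accepts. The short calculation below applies the output-size formula from the comment in the code to the four conv/pool blocks. It is only a sketch of the arithmetic: the helper names are hypothetical, and the 32×32 input size is an assumption taken from the padding comment, since the data pipeline is not shown.

```python
def conv_out(w, k=3, p=1, s=1):
    # (w - k + 2*p)/s + 1 = o, the formula from the comment in the network
    return (w - k + 2 * p) // s + 1


def pool_out(w, k=2, s=2):
    # 2x2 max pooling with stride 2 and no padding (output rounded down)
    return (w - k) // s + 1


def flattened_features(width, channels=512, blocks=4):
    for _ in range(blocks):
        width = conv_out(width)  # every 3x3 conv with padding=1 keeps the width
        width = pool_out(width)  # the pool at the end of each block halves it
    return channels * width * width


print(flattened_features(32))  # 2048
print(flattened_features(28))  # 512
```

So 2048 input features correspond to 32×32 images. With the native 28×28 MNIST size the flattened tensor would have 512 features, and the first linear layer would raise a shape error instead of training.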
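Compared with my earlier digit-recognition model, I only changed the targets and the last layer of the classifier, from 10 classes to 1 class + Sigmoid. I also changed the loss from cross-entropy to binary cross-entropy (BCE). What am I doing wrong?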
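Switching to one Sigmoid output with BCE changes what the loss and the accuracy code expect, and the training loop is not shown, so the following is only a sketch of the usual pattern. `probs` and `labels` are hypothetical stand-ins for a batch of model outputs and dataset labels.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins: 8 outputs after the final Sigmoid, shape (N, 1),
# and integer labels as a dataset would return them, shape (N,).
probs = torch.rand(8, 1)
labels = torch.tensor([0, 1, 1, 0, 1, 0, 0, 1])

criterion = nn.BCELoss()

# BCELoss expects float targets with the same shape as the model output.
targets = labels.float().unsqueeze(1)  # (N,) int64 -> (N, 1) float32
loss = criterion(probs, targets)

# With a single output unit the predicted class comes from a threshold;
# argmax over dim=1 would return 0 for every sample.
preds = (probs >= 0.5).long().squeeze(1)
accuracy = (preds == labels).float().mean().item() * 100

print(loss.item(), accuracy)
```

Accuracy code carried over from a 10-class model often uses `argmax` over the class dimension, which is worth checking when the reported accuracy looks implausible for two classes.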
These are the loss and accuracy values:
Epoch 1: TrL=49.0955, TrA=31.4211, VL=49.7285, VA=31.7340, TeL=49.2635, TeA=31.3758,
Epoch 2: TrL=49.0992, TrA=31.4235, VL=49.7285, VA=31.7340, TeL=49.2635, TeA=31.3758,
Epoch 3: TrL=49.0899, TrA=31.4176, VL=49.7285, VA=31.7340, TeL=49.2635, TeA=31.3758,
Epoch 4: TrL=49.0936, TrA=31.4199, VL=49.7285, VA=31.7340, TeL=49.2635, TeA=31.3758,
Epoch 5: TrL=49.0936, TrA=31.4199, VL=49.7285, VA=31.7340, TeL=49.2635, TeA=31.3758,
Epoch 6: TrL=49.0825, TrA=31.4128, VL=49.7285, VA=31.7340, TeL=49.2635, TeA=31.3758,
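What is going on? With 10 classes the accuracy was over 90%, but the simplified version with only 2 classes reaches only about 30%. How is that possible?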
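One reading that fits the logged loss of about 49, offered as a guess rather than a confirmed diagnosis: `nn.BCELoss` clamps its log terms at -100, so a Sigmoid that has saturated and outputs exactly 0 or 1 for every image produces a mean loss of 100 times the fraction of wrong predictions. Roughly 49% of the MNIST training digits are even, so a model that always outputs 1 ("odd") would sit near 49. The sketch below reproduces the effect on a hypothetical batch of four samples.

```python
import torch
import torch.nn as nn

# Hypothetical batch: the model outputs exactly 1.0 ("odd") for every image.
probs = torch.ones(4, 1)
targets = torch.tensor([[0.0], [1.0], [0.0], [1.0]])  # half even, half odd

# nn.BCELoss clamps each log term at -100, so every confidently wrong
# prediction costs 100 and every confidently right one costs 0.
loss = nn.BCELoss()(probs, targets)

print(loss.item())  # 50.0
```

The logged accuracy of about 31% does not follow from this alone, and the accuracy computation is not shown, so that part cannot be checked here.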
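Edit: after increasing the batch size from 64 to 128, the accuracy reached 60% and again stays constant.
In my opinion, the problem is that odd and even digits have very different representations. Take 1 and 3: images of these digits look nothing alike, so the convolutional network has trouble extracting features.
The network reaches 90% accuracy with 10 classes, so why does it need to be converted to 2 classes? If you know the digit is 1, 3, 5, 7 or 9, you know it is odd.
Because I am working on a transfer learning project, and for simplicity I want two neural networks that each do binary classification. I want at least 70% accuracy for both.
MNIST has 60,000 samples, and I think binary classification on this dataset is simpler than multi-class classification.
Most of the time binary classification is simpler to learn, but in this problem each class contains 5 different kinds of images. For example, if you have dogs and cats, binary classification is simpler than multi-class classification, because cats and dogs are more similar to each other (ears, snout).
A basic neural network gets high accuracy on this dataset. So if a more complex network gets lower accuracy, I think the problem lies only in the 2 classes.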
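The suggestion above, reading parity off the 10-class prediction, can be sketched as follows. The logits are made-up values standing in for the output of a 10-class digit classifier; no trained model is involved.

```python
import torch

# Hypothetical logits from a 10-class digit classifier for 3 images.
logits = torch.tensor([
    [0.1, 0.2, 0.1, 4.0, 0.0, 0.3, 0.1, 0.2, 0.1, 0.0],  # most likely a 3
    [0.0, 0.1, 0.2, 0.1, 0.0, 0.1, 0.3, 0.2, 5.0, 0.1],  # most likely an 8
    [3.5, 0.1, 0.0, 0.2, 0.1, 0.0, 0.1, 0.2, 0.1, 0.3],  # most likely a 0
])

# Option 1: take the predicted digit and reduce it to its parity.
parity_from_argmax = logits.argmax(dim=1) % 2

# Option 2: sum the softmax probabilities of the odd digits 1, 3, 5, 7, 9.
p_odd = torch.softmax(logits, dim=1)[:, 1::2].sum(dim=1)
parity_from_probs = (p_odd >= 0.5).long()

print(parity_from_argmax.tolist())  # [1, 0, 0]
print(parity_from_probs.tolist())   # [1, 0, 0]
```

This gives an even/odd baseline without retraining, which can be compared against a network trained directly on the binary targets.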