Python 3.x: Grayscale input to VGG in PyTorch

python-3.x pytorch

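I am new to PyTorch and want to use VGG for transfer learning. I want to remove the fully connected layers and add some new ones. I also want to use grayscale input instead of RGB input. To do this, I plan to sum the first layer's weights over its three input channels to obtain a single-channel weight.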

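I managed to remove the fully connected layers, but I am having trouble with the grayscale part. I sum the three channel weights to form a new weight and then try to change the state dict of the VGG model, which gives me an error. The network code is as follows: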

import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        vgg = models.vgg16(pretrained=True).features[:30]

        w1 = vgg.state_dict()['0.weight'][:, 0, :, :]  # first input channel of the first conv layer's weight
        w2 = vgg.state_dict()['0.weight'][:, 1, :, :]
        w3 = vgg.state_dict()['0.weight'][:, 2, :, :]
        w4 = w1 + w2 + w3  # add the weights of the three channels
        w4 = w4.unsqueeze(1)  # make it 4-dimensional: [64, 1, 3, 3]

        a = vgg.state_dict()  # create a new state dict
        a['0.weight'] = w4  # replace the first layer's weight in the new state dict

        vgg.load_state_dict(a)  # load the new state dict; this line raises the error

        self.vgg = nn.Sequential(vgg)
        self.fc1 = nn.Linear(14 * 14 * 512, 1000)
        self.fc2 = nn.Linear(1000, 2)

    def forward(self, x):
        x = self.vgg(x)
        x = x.view(-1, 14 * 14 * 512)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

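This produces the following error:

RuntimeError: Error(s) in loading state_dict for Sequential: size mismatch for 0.weight: copying a param with shape torch.Size([64, 1, 3, 3]) from checkpoint, the shape in current model is torch.Size([64, 3, 3, 3])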

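So it does not let me replace the weight with one of a different size. Is there a solution to this problem, or another approach I could try? All I want to do is use VGG's layers up to the fully connected layers and change the weights of the first layer.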
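The mismatch can be reproduced without downloading any pretrained weights. The sketch below is only an illustration: it uses a freshly initialized `nn.Conv2d(3, 64, 3, padding=1)` as a stand-in for VGG16's first layer, and the names `conv_rgb`, `state`, and `error_message` are assumptions that do not appear in the code above.

```python
import torch
import torch.nn as nn

# Stand-in for VGG16's first conv layer (3 input channels, 64 filters).
conv_rgb = nn.Conv2d(3, 64, kernel_size=3, padding=1)

state = conv_rgb.state_dict()
# Sum over the input-channel axis: [64, 3, 3, 3] -> [64, 1, 3, 3].
state["weight"] = state["weight"].sum(dim=1, keepdim=True)

try:
    conv_rgb.load_state_dict(state)
    error_message = None
except RuntimeError as exc:
    # The layer still declares a [64, 3, 3, 3] weight, so loading is rejected.
    error_message = str(exc)

print(error_message)
```

`load_state_dict` only copies values into parameters that already exist, so it cannot change a layer's shape. The layer itself has to be replaced.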

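You haven't specified where the vgg class comes from, but I assume it comes from torchvision.models. The VGG model is created for images with 3 channels. You can see this in its source code.

It is probably not a good idea to modify code inside the torchvision package, but you could create a copy in your own project and make in_channels configurable.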
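As a rough sketch of what a project-local copy with a configurable channel count could look like: the function below is modeled on the way VGG's feature layers are built from a configuration list, but it is written from scratch for illustration. The names `VGG16_CFG` and `make_layers` and the `in_channels` parameter are assumptions here, and batch normalization is left out.

```python
import torch
import torch.nn as nn

# VGG16 layer configuration: numbers are conv output channels, "M" is max pooling.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def make_layers(cfg, in_channels=3):
    """Build VGG-style feature layers with a configurable input channel count."""
    layers = []
    for v in cfg:
        if v == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers.append(nn.Conv2d(in_channels, v, kernel_size=3, padding=1))
            layers.append(nn.ReLU(inplace=True))
            in_channels = v
    return nn.Sequential(*layers)


# Grayscale variant: the first conv layer now expects a single channel.
features = make_layers(VGG16_CFG, in_channels=1)
out = features(torch.zeros(1, 1, 64, 64))
print(features[0].weight.shape, out.shape)
```

The layers built this way are randomly initialized. Pretrained weights would still have to be copied in separately, with the first layer's weight reduced to one input channel first.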
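In short: the error is caused by a mismatch between the pretrained model's parameters and the VGG model's structure.
Reason: by summing the channels, you changed the pretrained parameter from shape [64, 3, 3, 3] to [64, 1, 3, 3], but you did not change the structure of VGG, which still expects a weight of shape [64, 3, 3, 3].
Solution: remove the first convolutional layer from the VGG structure and add a new convolutional layer that fits the shape of your input.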
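A minimal sketch of that replacement, assuming a randomly initialized `nn.Conv2d(3, 64, 3, padding=1)` as a stand-in for the pretrained first layer (in a real model it would be taken from the VGG feature layers). The names `conv_rgb` and `conv_gray` are illustrative. The final check shows why summing the channel kernels is reasonable: the single-channel layer gives the same output as the original layer fed the gray image on all three channels.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the pretrained first VGG16 layer (random weights here).
conv_rgb = nn.Conv2d(3, 64, kernel_size=3, padding=1)

# New single-channel layer with the same filter count and geometry.
conv_gray = nn.Conv2d(1, 64, kernel_size=3, padding=1)
with torch.no_grad():
    # Sum the three input-channel kernels: [64, 3, 3, 3] -> [64, 1, 3, 3].
    conv_gray.weight.copy_(conv_rgb.weight.sum(dim=1, keepdim=True))
    conv_gray.bias.copy_(conv_rgb.bias)

gray = torch.rand(2, 1, 32, 32)
out_gray = conv_gray(gray)
# Same gray image copied into all three channels of the original layer.
out_rgb = conv_rgb(gray.repeat(1, 3, 1, 1))
print(torch.allclose(out_gray, out_rgb, atol=1e-4))
```

Copying inside `torch.no_grad()` keeps the new layer's parameters registered as usual, so they remain trainable afterwards.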
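For example, a trick used on Kaggle is to put one or two additional layers in front of the VGG input to expand the channels to the required number (3 in this case).
This is more robust than modifying the original model, because it lets you swap the pretrained backbone more easily.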
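One way this trick might look, as a sketch under stated assumptions: `GrayAdapter` is a hypothetical wrapper name, the 1x1 convolution is just one possible choice of expansion layer, and the small `backbone` below is a stand-in for a pretrained 3-channel network such as VGG16's feature layers.

```python
import torch
import torch.nn as nn


class GrayAdapter(nn.Module):
    """Hypothetical wrapper: expands 1-channel input to 3 channels, then calls the backbone."""

    def __init__(self, backbone):
        super().__init__()
        # Learnable 1x1 convolution that maps 1 channel to 3.
        self.expand = nn.Conv2d(1, 3, kernel_size=1)
        self.backbone = backbone

    def forward(self, x):
        return self.backbone(self.expand(x))


# Stand-in for a pretrained 3-channel backbone.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2),
)
model = GrayAdapter(backbone)
out = model(torch.rand(4, 1, 32, 32))
print(out.shape)
```

Because the backbone is untouched, its pretrained weights load without any shape changes, and a different backbone can be passed in without rewriting the adapter.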
I solved this by adding a new conv layer and initializing its weight with the sum of the three channel weights of VGG's first conv layer. I then excluded VGG's first conv layer.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

        features = models.vgg16(pretrained=True).features
        vgg_firstlayer = features[0]  # just the first conv layer
        vgg = features[1:30]  # layers after the first conv, up to but excluding the last max pool

        w1 = vgg_firstlayer.state_dict()['weight'][:, 0, :, :]
        w2 = vgg_firstlayer.state_dict()['weight'][:, 1, :, :]
        w3 = vgg_firstlayer.state_dict()['weight'][:, 2, :, :]
        w4 = w1 + w2 + w3  # add the weights of the three channels
        w4 = w4.unsqueeze(1)  # make it 4-dimensional: [64, 1, 3, 3]

        first_conv = nn.Conv2d(1, 64, 3, padding=(1, 1))  # create a new conv layer
        # initialize the new layer's weight with w4 (attribute name fixed: `weight`, not `weigth`)
        first_conv.weight = torch.nn.Parameter(w4, requires_grad=True)
        # initialize the new layer's bias with the bias of VGG's first conv layer
        first_conv.bias = torch.nn.Parameter(vgg_firstlayer.state_dict()['bias'], requires_grad=True)

        self.first_convlayer = first_conv  # the first layer is a 1-channel (grayscale) conv layer
        self.vgg = nn.Sequential(vgg)

        # features[1:30] contains four max-pool layers, so height and width shrink by 16:
        # 7 * 7 matches 112x112 inputs (224x224 inputs would give 14 * 14)
        self.fc1 = nn.Linear(7 * 7 * 512, 1000)
        self.fc2 = nn.Linear(1000, 2)

    def forward(self, x):
        x = self.first_convlayer(x)
        x = self.vgg(x)

        x = x.view(-1, 7 * 7 * 512)
        x = F.relu(self.fc1(x))

        x = self.fc2(x)
        return x