Deep learning: error when training a FasterRCNN with a custom backbone on grayscale images


To create an object detector for a single class on grayscale images, I followed the instructions in the tutorial.

Here is my code (note that I am using a pretrained DenseNet as the backbone, on my own dataset):

This is the error I get:

RuntimeError: Given groups=1, weight of size [64, 1, 7, 7], expected input[2, 3, 1344, 800] to have 1 channels, but got 3 channels instead

Based on the FasterRCNN architecture, I assume the problem is in the transform component, because it tries to normalize images that are originally grayscale rather than RGB:

FasterRCNN(
  (transform): GeneralizedRCNNTransform(
      Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
      Resize(min_size=(800,), max_size=1333, mode='bilinear')
  )
  (backbone): Sequential(
    (conv0): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu0): ReLU(inplace=True)
    (pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (denseblock1): _DenseBlock(
      (denselayer1): _DenseLayer(
        (norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu1): ReLU(inplace=True)
        (conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu2): ReLU(inplace=True)
        (conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      )
      
      ...............
        
    (norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (rpn): RegionProposalNetwork(
    (anchor_generator): AnchorGenerator()
    (head): RPNHead(
      (conv): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (cls_logits): Conv2d(1024, 15, kernel_size=(1, 1), stride=(1, 1))
      (bbox_pred): Conv2d(1024, 60, kernel_size=(1, 1), stride=(1, 1))
    )
  )
  (roi_heads): RoIHeads(
    (box_roi_pool): MultiScaleRoIAlign()
    (box_head): TwoMLPHead(
      (fc6): Linear(in_features=50176, out_features=1024, bias=True)
      (fc7): Linear(in_features=1024, out_features=1024, bias=True)
    )
    (box_predictor): FastRCNNPredictor(
      (cls_score): Linear(in_features=1024, out_features=2, bias=True)
      (bbox_pred): Linear(in_features=1024, out_features=8, bias=True)
    )
  )
)
Am I right? If so, how can I fix this? Is there a standard practice for handling grayscale images with FasterRCNN?

Thanks in advance! Much appreciated.

Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]) means the normalization is applied to all 3 channels of the input image: 0.485 is applied to the R channel, 0.456 to the G channel, and 0.406 to the B channel. The same goes for the standard-deviation values.
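This also explains why the error complains about *3* channels reaching a 1-channel conv: the normalize step subtracts a 3-element mean from the image, and broadcasting a (1, H, W) grayscale tensor against a (3, 1, 1) mean silently produces a (3, H, W) tensor. A minimal sketch of that broadcasting behavior (shapes chosen small for illustration):

```python
import torch

# Per-channel normalization: each mean/std entry maps to one channel.
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)

rgb = torch.rand(3, 4, 4)            # 3-channel image: normalizes as intended
normed = (rgb - mean) / std
print(normed.shape)                  # (3, 4, 4)

gray = torch.rand(1, 4, 4)           # 1-channel image
# Broadcasting (1, H, W) against (3, 1, 1) yields (3, H, W) -- a 3-channel
# tensor, which is exactly what the 1-channel backbone conv then rejects.
expanded = (gray - mean) / std
print(expanded.shape)                # (3, 4, 4)
```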

The first conv layer of your backbone expects a single-channel input, and that is why you get this error.

You can do the following to fix it:

Redefine the GeneralizedRCNNTransform and attach it to the model. You can do it like this:

# Put the pieces together inside a FasterRCNN model
model = FasterRCNN(backbone, num_classes=2, rpn_anchor_generator=anchor_generator, box_roi_pool=roi_pool)
# The change: single-element mean/std lists instead of the 3-channel ImageNet defaults
grcnn = torchvision.models.detection.transform.GeneralizedRCNNTransform(min_size=800, max_size=1333, image_mean=[0.485], image_std=[0.229])
model.transform = grcnn
model.to(device)