Error when training FasterRCNN with a custom backbone on grayscale images
Tags: deep-learning, pytorch, object-detection, faster-rcnn

To create an object detector for 1 class on grayscale images, I followed the instructions in the tutorial. Here is my code (note that I am using DenseNet as the backbone, a pretrained model, on my own dataset). This is the error I got:
RuntimeError: Given groups=1, weight of size [64, 1, 7, 7], expected input[2, 3, 1344, 800] to have 1 channels, but got 3 channels instead
Based on the FasterRCNN architecture, I assume the problem lies in the transform component, since it tries to normalize images that are originally grayscale rather than RGB:
FasterRCNN(
(transform): GeneralizedRCNNTransform(
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Resize(min_size=(800,), max_size=1333, mode='bilinear')
)
(backbone): Sequential(
(conv0): Conv2d(1, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(norm0): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu0): ReLU(inplace=True)
(pool0): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(denseblock1): _DenseBlock(
(denselayer1): _DenseLayer(
(norm1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu1): ReLU(inplace=True)
(conv1): Conv2d(64, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)
(norm2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu2): ReLU(inplace=True)
(conv2): Conv2d(128, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
)
...............
(norm5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(rpn): RegionProposalNetwork(
(anchor_generator): AnchorGenerator()
(head): RPNHead(
(conv): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(cls_logits): Conv2d(1024, 15, kernel_size=(1, 1), stride=(1, 1))
(bbox_pred): Conv2d(1024, 60, kernel_size=(1, 1), stride=(1, 1))
)
)
(roi_heads): RoIHeads(
(box_roi_pool): MultiScaleRoIAlign()
(box_head): TwoMLPHead(
(fc6): Linear(in_features=50176, out_features=1024, bias=True)
(fc7): Linear(in_features=1024, out_features=1024, bias=True)
)
(box_predictor): FastRCNNPredictor(
(cls_score): Linear(in_features=1024, out_features=2, bias=True)
(bbox_pred): Linear(in_features=1024, out_features=8, bias=True)
)
)
)
Am I right? If so, how can I fix this? Is there a standard practice for handling grayscale images with FasterRCNN?
Thanks in advance! Really appreciate it.

Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
means that the normalization is applied to all 3 channels of the input image: 0.485 is applied to the R channel, 0.456 to the G channel, and 0.406 to the B channel. The same holds for the standard deviation values.
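The per-channel arithmetic described above can be sketched in pure Python (this is an illustration, not the torchvision implementation; the helper name `normalize` is made up here):

def normalize(pixel, mean, std):
    # out[c] = (in[c] - mean[c]) / std[c], one (mean, std) pair per channel.
    # A grayscale pixel has 1 channel, so 3-channel statistics do not fit.
    if len(pixel) != len(mean):
        raise ValueError("channel count mismatch")
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]

# An RGB pixel equal to the ImageNet means normalizes to zero in every channel:
print(normalize([0.485, 0.456, 0.406],
                [0.485, 0.456, 0.406],
                [0.229, 0.224, 0.225]))  # → [0.0, 0.0, 0.0]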
The first conv layer of your backbone expects a single-channel input, which is why you get this error.
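The mismatch can be seen directly from the shapes in the error message; this small sketch (an illustration, with a made-up helper `channels_match`) mirrors the check PyTorch's convolution performs:

# Conv2d weights have shape (out_channels, in_channels, kH, kW).
weight_shape = (64, 1, 7, 7)     # grayscale DenseNet stem: built for 1 input channel
batch_shape = (2, 3, 1344, 800)  # batch produced by the default 3-channel transform

def channels_match(weight_shape, input_shape):
    # Dimension 1 of both shapes must agree; when it does not, PyTorch raises
    # "expected input ... to have 1 channels, but got 3 channels instead".
    return weight_shape[1] == input_shape[1]

print(channels_match(weight_shape, batch_shape))  # → False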
You can do the following to fix it: redefine the GeneralizedRCNNTransform and attach it to the model. You can do it like this:
# Put the pieces together inside a FasterRCNN model
model = FasterRCNN(backbone, num_classes=2,
                   rpn_anchor_generator=anchor_generator,
                   box_roi_pool=roi_pool)

# Changes: single-channel mean/std (as lists) instead of the 3-channel defaults
grcnn = torchvision.models.detection.transform.GeneralizedRCNNTransform(
    min_size=800, max_size=1333, image_mean=[0.485], image_std=[0.229])
model.transform = grcnn
model.to(device)