ResNet-18 as the backbone for Faster R-CNN
I am coding in PyTorch and want to use ResNet-18 as the backbone of Faster R-CNN. When I print the structure of ResNet-18, this is the output:
>>> import torch
>>> import torchvision
>>> import numpy as np
>>> import torchvision.models as models
>>> resnet18 = models.resnet18(pretrained=False)
>>> print(resnet18)
ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=1000, bias=True)
)
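My question is: up to which layer does the network act as the feature extractor? Should AdaptiveAvgPool2d be part of the Faster R-CNN backbone?

The linked example demonstrates how to train Mask R-CNN with an arbitrary backbone. I want to do the same with Faster R-CNN and train it with ResNet-18, but I am confused about which layer should be the last one included in the feature extractor.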
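I know how to use ResNet + Feature Pyramid Network as the backbone; my question is about plain ResNet.

For VGG and MobileNet, torchvision exposes the feature-extraction layers directly: .features extracts the relevant layers from the backbone model so they can be passed on to the object detection pipeline. You can read more about this in the resnet_fpn_backbone function.

In the example you shared, you just need to change backbone = torchvision.models.mobilenet_v2(pretrained=True).features to backbone = resnet_fpn_backbone('resnet50', pretrained_backbone)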
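To give you a brief understanding, resnet_fpn_backbone takes the ResNet backbone_name you provide (resnet18, resnet34, resnet50, ...) and extracts layer1 through layer4. The resulting BackboneWithFPN is then used as the backbone of the detection model.

If we want to use the output of the adaptive average pooling, we use this code for the different ResNets: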
import torch.nn as nn
import torchvision

# backbone
# backbone_name is expected to be set by the surrounding code.
# [:-1] drops only the final fc layer, so avgpool is kept and the
# output has shape (N, C, 1, 1).
if backbone_name == 'resnet_18':
    resnet_net = torchvision.models.resnet18(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 512
elif backbone_name == 'resnet_34':
    resnet_net = torchvision.models.resnet34(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 512
elif backbone_name == 'resnet_50':
    resnet_net = torchvision.models.resnet50(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnet_101':
    resnet_net = torchvision.models.resnet101(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnet_152':
    resnet_net = torchvision.models.resnet152(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnet_50_modified_stride_1':
    # resnet50 here is not the torchvision function; presumably a custom
    # modified-stride variant defined elsewhere (not shown in this post).
    resnet_net = resnet50(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnext101_32x8d':
    resnet_net = torchvision.models.resnext101_32x8d(pretrained=True)
    modules = list(resnet_net.children())[:-1]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
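If you want to use the convolutional feature map instead (this is the form a detection backbone needs, because avgpool collapses the spatial dimensions to 1x1), use the following code: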
import torch.nn as nn
import torchvision

# backbone
# [:-2] drops both avgpool and fc, so the output is the layer4 feature map.
# out_channels is set because FasterRCNN requires it on the backbone.
if backbone_name == 'resnet_18':
    resnet_net = torchvision.models.resnet18(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 512
elif backbone_name == 'resnet_34':
    resnet_net = torchvision.models.resnet34(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 512
elif backbone_name == 'resnet_50':
    resnet_net = torchvision.models.resnet50(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnet_101':
    resnet_net = torchvision.models.resnet101(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnet_152':
    resnet_net = torchvision.models.resnet152(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnet_50_modified_stride_1':
    # resnet50 here is presumably a custom modified-stride variant (not shown).
    resnet_net = resnet50(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
elif backbone_name == 'resnext101_32x8d':
    resnet_net = torchvision.models.resnext101_32x8d(pretrained=True)
    modules = list(resnet_net.children())[:-2]
    backbone = nn.Sequential(*modules)
    backbone.out_channels = 2048
I used something like this with newer versions of torch and torchvision:
import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.rpn import AnchorGenerator


def get_resnet18_backbone_model(num_classes, pretrained):
    from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

    print('Using fasterrcnn with res18 backbone...')

    backbone = resnet_fpn_backbone('resnet18', pretrained=pretrained, trainable_layers=5)

    # One size tuple per feature map returned by the FPN backbone (five in total).
    anchor_generator = AnchorGenerator(
        sizes=((16,), (32,), (64,), (128,), (256,)),
        aspect_ratios=tuple([(0.25, 0.5, 1.0, 2.0) for _ in range(5)]))

    roi_pooler = torchvision.ops.MultiScaleRoIAlign(featmap_names=['0', '1', '2', '3'],
                                                    output_size=7, sampling_ratio=2)

    # put the pieces together inside a FasterRCNN model
    model = FasterRCNN(backbone, num_classes=num_classes,
                       rpn_anchor_generator=anchor_generator,
                       box_roi_pool=roi_pooler)
    return model
Note that resnet_fpn_backbone() already sets backbone.out_channels to the correct value.

I tested resnet18(pretrained=True).features, but it gives: AttributeError: 'ResNet' object has no attribute 'features'.

You can replace backbone = torchvision.models.resnet18(pretrained=True).features with backbone = resnet_fpn_backbone('resnet50', pretrained_backbone). You need to include from .backbone_utils import resnet_fpn_backbone.

resnet_fpn_backbone returns a ResNet-based Feature Pyramid Network; I want a plain ResNet as the backbone.

As far as I know, torchvision currently supports ResNet with FPN by default. If needed, you can customize it by pulling their repo. I will do my research again and confirm.

I think this code solves it:
resnet_net = torchvision.models.resnet18(pretrained=True)
modules = list(resnet_net.children())[:-2]
backbone = nn.Sequential(*modules)
backbone.out_channels = 512