
Training Detectron2 on a subset of the COCO dataset

python machine-learning deep-learning pytorch


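I am trying to train a model for vehicle and person detection using Detectron2 and the COCO dataset, but I am running into a problem when the model weights are loaded.

I used a post on SO and the code from here (the filter.py file) to filter the COCO dataset so that it contains only annotations and images from the classes "person", "car", "bike", "truck" and "bicycle". My directory structure is now: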

main
  - annotations:
    - instances_train2017_filtered.json
    - instances_val2017_filtered.json
  - images:
    - train2017_filtered (lots of images inside)
    - val2017_filtered (lots of images inside)

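Basically, the only thing I did here was remove the annotation entries and images that do not correspond to these classes, and change the category IDs (so they now run from 1 to 5).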
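
The filtering step described above can be sketched in plain Python. This is only an illustration under assumptions, not the filter.py script the question refers to: the function name `filter_coco` and the toy data are hypothetical, and the sketch assumes the standard COCO layout with top-level "images", "annotations" and "categories" lists.

```python
def filter_coco(coco, keep_names):
    """Return a COCO-style dict restricted to keep_names,
    with category ids remapped to 1..len(keep_names)."""
    # Original category id for each category name present in the file.
    name_to_old = {c["name"]: c["id"] for c in coco["categories"]}
    # New consecutive ids follow the order of keep_names; unknown names are ignored.
    old_to_new = {
        name_to_old[name]: new_id
        for new_id, name in enumerate(keep_names, start=1)
        if name in name_to_old
    }

    # Keep only annotations of the selected classes and rewrite their category id.
    annotations = [
        {**ann, "category_id": old_to_new[ann["category_id"]]}
        for ann in coco["annotations"]
        if ann["category_id"] in old_to_new
    ]

    # Keep only images that still have at least one annotation.
    used_image_ids = {ann["image_id"] for ann in annotations}
    images = [img for img in coco["images"] if img["id"] in used_image_ids]

    categories = sorted(
        ({**c, "id": old_to_new[c["id"]]} for c in coco["categories"] if c["id"] in old_to_new),
        key=lambda c: c["id"],
    )
    return {**coco, "images": images, "annotations": annotations, "categories": categories}


# Tiny hand-made example (not real COCO data).
toy = {
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "car"}, {"id": 18, "name": "dog"}],
    "images": [{"id": 10}, {"id": 11}, {"id": 12}],
    "annotations": [
        {"id": 100, "image_id": 10, "category_id": 1},
        {"id": 101, "image_id": 10, "category_id": 18},
        {"id": 102, "image_id": 11, "category_id": 3},
        {"id": 103, "image_id": 12, "category_id": 18},
    ],
}
filtered = filter_coco(toy, ["person", "car"])
```

In the toy example, image 12 is dropped because its only annotation belongs to a class that was not kept, and "car" moves from id 3 to id 2.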

Then I used the code from the Detectron2 tutorial:

import os
import random

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultPredictor, DefaultTrainer
from detectron2.utils.visualizer import Visualizer

register_coco_instances("train",
                        {},
                        "/home/jakub/Projects/coco/annotations/instances_train2017_filtered.json",
                        "/home/jakub/Projects/coco/images/train2017_filtered/")

register_coco_instances("val",
                        {},
                        "/home/jakub/Projects/coco/annotations/instances_val2017_filtered.json",
                        "/home/jakub/Projects/coco/images/val2017_filtered/")

metadata = MetadataCatalog.get("train")
dataset_dicts = DatasetCatalog.get("train")

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("train",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 300
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 512
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 5

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.DATASETS.TEST = ("val", )
predictor = DefaultPredictor(cfg)

img = cv2.imread("demo/input.jpg")
outputs = predictor(img)

for d in random.sample(dataset_dicts, 1):
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)
    v = Visualizer(im[:, :, ::-1],
                   metadata=metadata,
                   scale=0.8)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2.imwrite('demo/output_retrained.jpg', out.get_image()[:, :, ::-1])

During training I get the following errors:

Unable to load 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (6, 1024) in the model!
Unable to load 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (6,) in the model!
Unable to load 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (20, 1024) in the model!
Unable to load 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (20,) in the model!
Unable to load 'roi_heads.mask_head.predictor.weight' to the model due to incompatible shapes: (80, 256, 1, 1) in the checkpoint but (5, 256, 1, 1) in the model!
Unable to load 'roi_heads.mask_head.predictor.bias' to the model due to incompatible shapes: (80,) in the checkpoint but (5,) in the model!

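After training, the model fails to predict anything useful, even though the total loss decreased during training. I know I should get a warning because of the size mismatch (I reduced the number of classes), and from what I have read online that is normal, but I do not get "Skipped" after each error line. I think the model is not actually loading anything here, and I would like to know why and how to fix it.

EDIT

For comparison, similar behaviour in an almost identical situation was reported as an issue, but there every error line ends with "Skipped", which makes them effectively warnings rather than errors.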

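This "warning" basically says that you are trying to initialize weights from a model that was trained on a different number of classes. As you have read, this is expected.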
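
To make the class-count explanation concrete, here is a small sketch that reproduces the shapes from the log. The arithmetic is inferred from the numbers in the messages above (one extra background score for the classifier, four box coordinates per class, one mask channel per class). The helper `expected_head_shapes` is hypothetical and does not call Detectron2.

```python
def expected_head_shapes(num_classes, box_dim=1024, mask_dim=256):
    """Shapes of the class-dependent layers for a given number of classes."""
    return {
        # Classifier: one score per class plus one for background.
        "roi_heads.box_predictor.cls_score.weight": (num_classes + 1, box_dim),
        "roi_heads.box_predictor.cls_score.bias": (num_classes + 1,),
        # Box regressor: four coordinates per class.
        "roi_heads.box_predictor.bbox_pred.weight": (num_classes * 4, box_dim),
        "roi_heads.box_predictor.bbox_pred.bias": (num_classes * 4,),
        # Mask predictor: one output channel per class.
        "roi_heads.mask_head.predictor.weight": (num_classes, mask_dim, 1, 1),
        "roi_heads.mask_head.predictor.bias": (num_classes,),
    }


checkpoint_shapes = expected_head_shapes(80)  # COCO-pretrained checkpoint
model_shapes = expected_head_shapes(5)        # NUM_CLASSES = 5 in the question

# Layers whose shapes differ cannot take their weights from the checkpoint.
mismatched = sorted(k for k in model_shapes if model_shapes[k] != checkpoint_shapes[k])
for key in mismatched:
    print(f"{key}: {checkpoint_shapes[key]} in the checkpoint but {model_shapes[key]} in the model")
```

Only these six class-dependent tensors differ; they are the same six keys listed in the log, which is consistent with the rest of the network being loaded from the checkpoint.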

I suspect you are not getting any results from training because your MetadataCatalog entry does not have the "thing_classes" attribute set. You are only calling

MetadataCatalog.get("train")

Calling

MetadataCatalog.get("train").set(thing_classes=["person", "car", "bike", "truck", "bicycle"])

should fix this, but if it does not, I am fairly sure your JSON is corrupted.
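
If the JSON is the suspect, a quick consistency check can narrow things down before retraining. The sketch below is only one possible check, assuming the standard COCO layout. The helpers `check_coco_dict` and `class_names` and the toy data are hypothetical, and the code does not use Detectron2.

```python
import json


def check_coco_dict(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    category_ids = [c["id"] for c in coco.get("categories", [])]
    if len(category_ids) != len(set(category_ids)):
        problems.append("duplicate category ids")
    image_ids = {img["id"] for img in coco.get("images", [])}
    for ann in coco.get("annotations", []):
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        if ann["image_id"] not in image_ids:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
    return problems


def class_names(coco):
    """Category names ordered by category id."""
    return [c["name"] for c in sorted(coco["categories"], key=lambda c: c["id"])]


# Toy data for illustration; for a real file, load it first:
#   with open("instances_train2017_filtered.json") as f:
#       coco = json.load(f)
good = {
    "categories": [{"id": 2, "name": "car"}, {"id": 1, "name": "person"}],
    "images": [{"id": 10}],
    "annotations": [{"id": 100, "image_id": 10, "category_id": 2}],
}
bad = {**good, "annotations": [{"id": 101, "image_id": 99, "category_id": 7}]}

print(check_coco_dict(good))
print(check_coco_dict(bad))
print(class_names(good))
```

An empty problem list does not prove the file is correct (boxes and segmentations are not checked), but a non-empty one points to a filtering bug.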

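Did you figure out what the problem was?

@ErlendD. It turned out to be a warning, but for some unknown reason the "warning" part was stripped from the printed message.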