Deep learning: YOLOv4 reports 30 hours of training time on Colab Pro with only 340 training images

I am trying to test my model on Colab Pro, using only 340 training images across 16 classes. However, Colab Pro tells me there are still about 30 hours of training time left:

(next mAP calculation at 1200 iterations) 
 Last accuracy mAP@0.5 = 0.37 %, best = 0.37 % 
 1187: 3.270728, 3.027621 avg loss, 0.010000 rate, 1.429193 seconds, 75968 images, 30.824708 hours left
Loaded: 1.136631 seconds - performance bottleneck on CPU or Disk HDD/SSD
...
...
...
 (next mAP calculation at 1300 iterations) 
 Last accuracy mAP@0.5 = 0.33 %, best = 0.37 % 
 1278: 3.231166, 2.967602 avg loss, 0.010000 rate, 2.552415 seconds, 81792 images, 30.512658 hours left
Loaded: 0.712928 seconds - performance bottleneck on CPU or Disk HDD/SSD
I don't understand why this is happening; I only have a small dataset.

Here are my cfg parameters:

[net]
# Testing
#batch=1
#subdivisions=1
# Training
batch=64
subdivisions=16
width=1024
height=1024
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1
 
learning_rate=0.01
burn_in=1000
max_batches = {max_batches}
policy=steps
steps={steps_str}
scales=.1,.1
 
[convolutional]
batch_normalize=1
filters=32
size=3
stride=2
pad=1
activation=leaky
 
[convolutional]
batch_normalize=1
filters=64
size=3
stride=2
pad=1
activation=leaky
 
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
 
[route]
layers=-1
groups=2
group_id=1
 
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
 
[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky
 
[route]
layers = -1,-2
 
[convolutional]
batch_normalize=1
filters=64
size=1
stride=1
pad=1
activation=leaky
 
[route]
layers = -6,-1
 
[maxpool]
size=2
stride=2
 
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
 
[route]
layers=-1
groups=2
group_id=1
 
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
 
[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky
 
[route]
layers = -1,-2
 
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
 
[route]
layers = -6,-1
 
[maxpool]
size=2
stride=2
 
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
 
[route]
layers=-1
groups=2
group_id=1
 
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
 
[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky
 
[route]
layers = -1,-2
 
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
 
[route]
layers = -6,-1
 
[maxpool]
size=2
stride=2
 
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
 
##################################
 
[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky
 
[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky
 
[convolutional]
size=1
stride=1
pad=1
filters={num_filters}
activation=linear
 
 
 
[yolo]
mask = 3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes={num_classes}
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
truth_thresh = 1
random=1
nms_kind=greedynms
beta_nms=0.6
ignore_thresh = .9 
iou_normalizer=0.5 
iou_loss=giou
 
[route]
layers = -4
 
[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky
 
[upsample]
stride=2
 
[route]
layers = -1, 23
 
[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky
 
[convolutional]
size=1
stride=1
pad=1
filters={num_filters}
activation=linear
 
[yolo]
mask = 1,2,3
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes={num_classes}
num=6
jitter=.3
scale_x_y = 1.05
cls_normalizer=1.0
ignore_thresh = .9 
iou_normalizer=0.5
iou_loss=giou
truth_thresh = 1
random=1
nms_kind=greedynms
beta_nms=0.6

Your training time depends on the max_batches parameter, which is essentially the maximum number of training iterations (each iteration processes one batch of 64 images, regardless of how small your dataset is).

Following the standard recommendation, max_batches should be classes*2000. In your case that is 16*2000 = 32000 iterations. With about 30000 iterations still to go at the 1.4-2.5 seconds per iteration shown in your log, the remaining time adds up to tens of hours. That is why training takes this long despite the small dataset: the iteration count is driven by the number of classes, not by the number of images.
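
For reference, here is a minimal Python sketch of how the placeholder values in your cfg template ({max_batches}, {steps_str}, {num_filters}) are typically derived. The rules of thumb follow the AlexeyAB darknet README; the variable names are just assumptions matching your template:

# Rules of thumb for filling the darknet cfg template placeholders.
num_classes = 16

# max_batches: classes * 2000, but not less than 6000
max_batches = max(num_classes * 2000, 6000)          # 32000 for 16 classes

# steps: 80% and 90% of max_batches
steps_str = f"{int(max_batches * 0.8)},{int(max_batches * 0.9)}"  # "25600,28800"

# filters in the [convolutional] layer just before each [yolo] layer:
# (classes + 5) * number of masks (each of your [yolo] layers uses 3 masks)
num_filters = (num_classes + 5) * 3                  # 63

print(max_batches, steps_str, num_filters)

If you only want a quick sanity check rather than a full training run, you can lower max_batches (and steps accordingly) yourself; classes*2000 is a recommendation for reaching a usable mAP, not a hard requirement.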

Thank you very much. Suddenly I am getting this output:

v3 (giou loss, Normalizer: (iou: 0.50, obj: 1.00, cls: 1.00) Region 30 Avg (iou: 0.000000), count: 6, class_loss = -nan, iou_loss = -nan, total_loss = -nan

Why is my IoU 0 while everything else is -nan?