DeepLab TensorFlow implementation: training on custom data. How do I choose the parameters?
I am trying to train DeepLabv3+ with my own dataset of images and segmentation masks. However, the official GitHub project () only provides training recipes for the VOC, Cityscapes, and ADE20K datasets, so I tried to organize my dataset the same way as the VOC dataset. My goal is to segment a table out of a video. I extracted 80 consecutive frames from one video; the image size is 1920*1080. I used "Labelme" to create table masks for these 80 images (the table looks similar across the frames). I split them into 60 for training and 20 for validation. The directory structure mirrors the VOC layout:
JPEGImages: the RGB color images
Segmentation: the txt files listing the train, val, and trainval file names
SegmentationClass: the mask images with a color map
After that, I modified "download_and_convert_voc2012.sh" to remove the color map and create the tfrecord files. Then I modified "data_generator.py" to change the dataset information for VOC. In other words, I changed the original code:
_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
splits_to_sizes={
'train': 1464,
'train_aug': 10582,
'trainval': 2913,
'val': 1449,
},
num_classes=21,
ignore_label=255,
)
to the corresponding information for my own dataset.
The next step was to train with the command below:
python3 deeplab/train.py \
--logtostderr \
--training_number_of_steps=10000 \
--learning_rate_decay_step=500 \
--train_split="train" \
--base_learning_rate=0.0001 \
--adam_learning_rate=0.0001 \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--train_crop_size="513,513" \
--train_batch_size=1 \
--dataset="pascal_voc_seg" \
--tf_initial_checkpoint="/home/.../models/check_point/xception/model.ckpt.index" \
--train_logdir="/home/.../models/train_log" \
--dataset_dir="/home/.../models/research/deeplab/datasets/pascal_voc_seg/tfrecord"
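As a side note on the learning-rate flags: if I read deeplab's train_utils.py correctly, the default learning_policy is 'poly', so --learning_rate_decay_step only takes effect with --learning_policy=step. Under that assumption, the schedule for the flags above can be sketched as:

```python
def poly_learning_rate(step, base_lr=0.0001, max_steps=10000, power=0.9):
    """Sketch of the 'poly' schedule: lr = base_lr * (1 - step/max_steps)**power."""
    return base_lr * (1.0 - float(step) / max_steps) ** power

# the rate decays smoothly from base_lr at step 0 down to 0 at max_steps
print(poly_learning_rate(0))      # 0.0001
print(poly_learning_rate(10000))  # 0.0
```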
The initial checkpoint is "xception65_coco_voc_trainval", and the tfrecord dataset was generated from my 80 images. However, over 10000 steps the training loss just fluctuated between roughly 4.8 and 5.5. I tried lowering the learning rate, but the loss did not decrease. I tried changing "train_crop_size" to "1920,1080", but my machine crashed. Should I resize the images to around 500*500? So far I have not resized them. After training for 10000 steps, here are the last 100 steps:
I0201 21:24:03.829240 140124578015040 learning.py:507] global step 9900: loss = 5.0399 (3.194 sec/step)
INFO:tensorflow:global step 9910: loss = 4.8355 (3.152 sec/step)
I0201 21:24:35.473860 140124578015040 learning.py:507] global step 9910: loss = 4.8355 (3.152 sec/step)
INFO:tensorflow:global step 9920: loss = 4.8337 (3.179 sec/step)
I0201 21:25:07.198329 140124578015040 learning.py:507] global step 9920: loss = 4.8337 (3.179 sec/step)
INFO:tensorflow:global step 9930: loss = 4.8411 (3.156 sec/step)
I0201 21:25:38.925366 140124578015040 learning.py:507] global step 9930: loss = 4.8411 (3.156 sec/step)
INFO:tensorflow:global step 9940: loss = 4.8540 (3.188 sec/step)
I0201 21:26:10.489563 140124578015040 learning.py:507] global step 9940: loss = 4.8540 (3.188 sec/step)
INFO:tensorflow:global step 9950: loss = 4.8426 (3.221 sec/step)
I0201 21:26:42.325965 140124578015040 learning.py:507] global step 9950: loss = 4.8426 (3.221 sec/step)
INFO:tensorflow:global step 9960: loss = 4.8415 (3.130 sec/step)
I0201 21:27:14.000005 140124578015040 learning.py:507] global step 9960: loss = 4.8415 (3.130 sec/step)
INFO:tensorflow:global step 9970: loss = 4.9121 (3.163 sec/step)
I0201 21:27:45.751608 140124578015040 learning.py:507] global step 9970: loss = 4.9121 (3.163 sec/step)
INFO:tensorflow:global step 9980: loss = 4.8351 (3.145 sec/step)
I0201 21:28:17.505195 140124578015040 learning.py:507] global step 9980: loss = 4.8351 (3.145 sec/step)
INFO:tensorflow:global step 9990: loss = 4.8401 (3.153 sec/step)
I0201 21:28:49.131623 140124578015040 learning.py:507] global step 9990: loss = 4.8401 (3.153 sec/step)
INFO:tensorflow:Recording summary at step 9990.
I0201 21:28:50.591169 140119415715584 supervisor.py:1050] Recording summary at step 9990.
INFO:tensorflow:global step 10000: loss = 4.8512 (3.149 sec/step)
I0201 21:29:21.474581 140124578015040 learning.py:507] global step 10000: loss = 4.8512 (3.149 sec/step)
INFO:tensorflow:Stopping Training.
I0201 21:29:21.475129 140124578015040 learning.py:777] Stopping Training.
INFO:tensorflow:Finished training! Saving model to disk.
I0201 21:29:21.475337 140124578015040 learning.py:785] Finished training! Saving model to disk.
/home/.../.local/lib/python3.6/site-packages/tensorflow/python/summary/writer/writer.py:386: UserWarning: Attempting to use a closed FileWriter. The operation will be a noop unless the FileWriter is explicitly reopened.
warnings.warn("Attempting to use a closed FileWriter. "
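On the crash with the full-resolution crop size: rather than training at 1920*1080, one option is to downscale each frame so that its longer side fits the 513 crop while keeping the aspect ratio. A small helper to compute the target size (the function name is mine; PIL's Image.resize would then do the actual resampling):

```python
def fit_within(width, height, limit=513):
    """Scale (width, height) uniformly so the longer side equals `limit`."""
    scale = limit / float(max(width, height))
    return int(round(width * scale)), int(round(height * scale))

print(fit_within(1920, 1080))  # (513, 289)
```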
After training, I exported the model with the command below:
python3 deeplab/export_model.py \
--logtostderr \
--checkpoint_path="/home/.../models/train_log/model.ckpt-10000" \
--export_path="/home/.../models/mode_export/table_seg_graph.pb" \
--model_variant="xception_65" \
--atrous_rates=6 \
--atrous_rates=12 \
--atrous_rates=18 \
--output_stride=16 \
--decoder_output_stride=4 \
--num_classes=2 \
--crop_size=513 \
--crop_size=513 \
--inference_scales=1.0
I got the frozen model "table_seg_graph.pb". After that, I tried to modify "deeplab_demo.ipynb" to use this model file. I removed the code that unpacks the tar file and added the code below to load the pb file:
class DeepLabModel(object):
    def __init__(self, tarball_path):
        ...
        with tf.gfile.GFile(tarball_path, "rb") as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())
        ...
    ...
I also changed the model-download part to a direct model path:
download_path = '/home.../models/mode_export/table_seg_graph.pb'
MODEL = DeepLabModel(download_path)
Then I loaded a table image to test the model:
def run_visualization(url):
    try:
        original_im = Image.open('/home/.../models/research/deeplab/datasets/pascal_voc_seg/Tabledata/JPEGImages/v00019.jpg')
    except IOError:
        ...
However, the result is nothing: the model does not detect anything, and the output mask is entirely black.
Because the video is copyrighted, I cannot show you the image results.
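One thing worth checking for the all-black result: the ground-truth PNGs fed to build_voc2012_data.py must contain raw class indices (0 = background, 1 = table), not colormapped RGB values — removing the colormap is what remove_gt_colormap.py does for VOC. A quick sanity check could look like this (the helper name is mine; in practice each mask would be loaded with PIL and converted with numpy):

```python
import numpy as np

def mask_label_values(mask_array):
    """Return the sorted distinct pixel values of a ground-truth mask.

    For a 2-class dataset, a correctly encoded mask should contain only
    values from {0, 1} (plus 255 if an ignore region is used).
    """
    return sorted(np.unique(np.asarray(mask_array)).tolist())

# example: a correctly encoded 4x4 mask with a small "table" region
demo = np.zeros((4, 4), dtype=np.uint8)
demo[1:3, 1:3] = 1
print(mask_label_values(demo))  # [0, 1]
```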
I think I should resize my 1920*1080 frames to 513*513; beyond that, I am not sure. Could someone tell me what parameters I should choose?
What "atrous_rates" should I choose?
What "learning rate" should I set?
What "crop size" should I set? (I have the same problem — is there any solution for running inference at 1920*1080 without resizing?)