TensorFlow: why is the evaluation metric (mAP) of a detection model always 0 when normalize_coordinates is set to false in multiscale_anchor_generator?

Tags: tensorflow, object-detection-api

I am training an efficientdet_d0 model on a dataset of 6000 training images and 600 test images, each of shape (1500, 1500). For a specific reason, I have to set normalize_coordinates to false in pipeline.config:

anchor_generator {
  multiscale_anchor_generator {
    min_level: 3
    max_level: 7
    anchor_scale: 4.0
    aspect_ratios: [0.423, 0.653, 1.0, 1.532, 2.365]
    scales_per_octave: 3
    normalize_coordinates: false
  }
}
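
To make the numbers in this block concrete, here is a minimal sketch of the anchor shapes such a configuration would produce. It assumes the usual RetinaNet-style construction (stride 2^level, base size anchor_scale * 2^level, scales 2^(i / scales_per_octave)); the helper name anchor_shapes is hypothetical and is not part of the Object Detection API.

```python
import math


def anchor_shapes(level, anchor_scale=4.0, scales_per_octave=3,
                  aspect_ratios=(0.423, 0.653, 1.0, 1.532, 2.365)):
    """Return the (height, width) in pixels of every anchor shape at one level.

    Assumption: base size is anchor_scale * 2**level and an aspect ratio r
    stretches the width by sqrt(r) while shrinking the height by sqrt(r).
    """
    base = anchor_scale * 2 ** level
    shapes = []
    for octave in range(scales_per_octave):
        scale = 2 ** (octave / scales_per_octave)
        for ratio in aspect_ratios:
            root = math.sqrt(ratio)
            shapes.append((base * scale / root, base * scale * root))
    return shapes


for level in range(3, 8):
    shapes = anchor_shapes(level)
    stride = 2 ** level
    # shapes[2] is the square anchor (aspect ratio 1.0) at the first scale.
    print(f"level {level}: stride {stride}, {len(shapes)} shapes, "
          f"square anchor {shapes[2][0]:.0f} x {shapes[2][1]:.0f} px")
```

Under these assumptions every shape is expressed in pixels, from 32 px squares at level 3 up to 512 px squares at level 7, which is the scale the anchors keep when normalize_coordinates is false.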

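I am using transfer learning from an efficientnet_b0 pretrained on ImageNet.

The problem is that even after 10k steps, the mAP stays at zero (or very close to 0). I tried training with different learning rates, but the problem is the same. However, when normalize_coordinates is set to true (the default in the source config), I get an mAP of 86% after the same number of steps.

I tried to pinpoint where this normalization happens in the source code, and I found: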
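if self._normalize_coordinates:
  if im_height == 1 or im_width == 1:
    raise ValueError(
        'Normalized coordinates were requested upon construction of the '
        'MultiscaleGridAnchorGenerator, but a subsequent call to '
        'generate did not supply dimension information.')
  anchor_grid = box_list_ops.to_normalized_coordinates(
      anchor_grid, im_height, im_width, check_range=False)
anchor_grid_list.append(anchor_grid)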
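
For readers unfamiliar with this step, the following is a small pure-Python sketch of what normalizing an anchor grid amounts to: dividing the y coordinates by the image height and the x coordinates by the image width. The function to_normalized and the sample anchors are hypothetical stand-ins, not the library's implementation.

```python
def to_normalized(boxes, im_height, im_width):
    """Scale [ymin, xmin, ymax, xmax] pixel boxes into image-relative units."""
    return [
        [ymin / im_height, xmin / im_width, ymax / im_height, xmax / im_width]
        for ymin, xmin, ymax, xmax in boxes
    ]


# Two hypothetical 32 x 32 px anchors on a 1536 x 1536 padded image.
anchors_px = [
    [0.0, 0.0, 32.0, 32.0],
    [752.0, 752.0, 784.0, 784.0],
]
anchors_norm = to_normalized(anchors_px, 1536, 1536)

for px, norm in zip(anchors_px, anchors_norm):
    print(px, "->", [round(v, 4) for v in norm])
```

With normalization skipped, the anchors stay in the pixel units of the first list, so anything compared against them has to be in pixels as well.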

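I know that the generated anchors are used to encode the groundtruth bounding boxes (which are normalized when converting to .tfrecords) and to decode the raw predictions. So I also tried training on a dataset where the groundtruth bounding boxes are not normalized (since the anchors are not normalized), but I ran into the same problem (it even got worse).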
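
To illustrate why the units of anchors and groundtruth boxes have to agree, here is a minimal sketch that computes IoU between one anchor and one groundtruth box, first with both in pixels and then with the groundtruth normalized to [0, 1]. All boxes are hypothetical, and the sketch only shows the arithmetic; it does not establish that this mismatch is what happens inside the training pipeline.

```python
def iou(box_a, box_b):
    """Intersection over union of two [ymin, xmin, ymax, xmax] boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


IMAGE_SIZE = 1536.0
gt_pixels = [700.0, 700.0, 828.0, 828.0]       # hypothetical 128 x 128 px object
gt_normalized = [v / IMAGE_SIZE for v in gt_pixels]
anchor_pixels = [704.0, 704.0, 832.0, 832.0]   # hypothetical nearby 128 px anchor

iou_same_units = iou(gt_pixels, anchor_pixels)
iou_mixed_units = iou(gt_normalized, anchor_pixels)

print("IoU, both in pixels:", round(iou_same_units, 3))
print("IoU, normalized groundtruth vs pixel anchor:", iou_mixed_units)
```

In this toy case the same object clears a 0.5 matching threshold when the units agree and has no overlap at all when they do not, which is one way a matcher could end up with no useful positive anchors.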
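So I wonder whether setting normalize_coordinates to false is itself the cause of the failure, meaning I have to train with true. Or do I need to train for longer (around 100k steps)? Maybe I have to modify other parts of the config to make it work (such as box_coder).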
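The reason I don't want to normalize the anchors is model export. At inference time the image size is variable (different from training time). So when exporting the model, I override the input resizer with pad_to_multiple_resizer { multiple: 128, convert_to_grayscale: false }. With normalize_coordinates: true, this export fails (a ValueError is raised, because without an input the resizer provides no shape information), which is why I set it to false.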
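
As a small illustration of the export-time resizer, the sketch below computes the padded side length for a few input sizes, assuming the resizer pads each dimension up to the next multiple of 128 without rescaling. The helper padded_size is hypothetical.

```python
def padded_size(size, multiple=128):
    """Smallest multiple of `multiple` that is greater than or equal to `size`."""
    return -(-size // multiple) * multiple  # ceiling division, then scale back up


for side in (1500, 1536, 640, 1000):
    print(side, "->", padded_size(side))
```

Under that assumption a 1500 px side pads to 1536 px, the same size the training config produces with keep_aspect_ratio_resizer, while other inference sizes yield other padded shapes that are only known once an image is supplied.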
NB: this is my config file:
model {
  ssd {
    inplace_batchnorm_update: true
    freeze_batchnorm: false
    num_classes: 13
    add_background_class: false
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
        use_matmul_gather: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    encode_background_as_zeros: true
    anchor_generator {
      multiscale_anchor_generator {
        min_level: 3
        max_level: 7
        anchor_scale: 4.0
        aspect_ratios: [0.423, 0.653, 1.0, 1.532, 2.365]
        scales_per_octave: 3
        normalize_coordinates: false
      }
    }
    image_resizer {
      keep_aspect_ratio_resizer {
        min_dimension: 1536
        max_dimension: 1536
        pad_to_max_dimension: true
      }
    }
    box_predictor {
      weight_shared_convolutional_box_predictor {
        depth: 88
        class_prediction_bias_init: -4.6
        conv_hyperparams {
          force_use_bias: true
          activation: SWISH
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            random_normal_initializer {
              stddev: 0.01
              mean: 0.0
            }
          }
          batch_norm {
            scale: true
            decay: 0.99
            epsilon: 0.001
          }
        }
        num_layers_before_predictor: 3
        kernel_size: 3
        use_depthwise: true
      }
    }
    feature_extractor {
      type: "ssd_efficientnet-b0_bifpn_keras"
      conv_hyperparams {
        regularizer {
          l2_regularizer {
            weight: 3.9999998989515007e-05
          }
        }
        initializer {
          truncated_normal_initializer {
            mean: 0.0
            stddev: 0.029999999329447746
          }
        }
        activation: SWISH
        batch_norm {
          decay: 0.9900000095367432
          scale: true
          epsilon: 0.0010000000474974513
        }
        force_use_bias: true
      }
      bifpn {
        min_level: 3
        max_level: 7
        num_iterations: 3
        num_filters: 64
      }
    }
    loss {
      classification_loss {
        weighted_sigmoid_focal {
          alpha: 0.25
          gamma: 1.5
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    normalize_loc_loss_by_codesize: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.5
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config {
  fine_tune_checkpoint: "PATH_TO_CHECKPOINT/ckpt-0"
  fine_tune_checkpoint_type: "classification"
  fine_tune_checkpoint_version: V2
  num_steps: 32250
  startup_delay_steps: 0
  replicas_to_aggregate: 1
  max_number_of_boxes: 100
  retain_original_images: true 
  unpad_groundtruth_tensors: false
  batch_size: 1
  sync_replicas: false
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        constant_learning_rate {
          learning_rate: 1e-4
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    random_adjust_brightness  {
    }
  }
}

train_input_reader: {
  name: "" 
  label_map_path: "PATH_TO_LABELMAP/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "PATH_TO_DATA/train.record"
  }
}

eval_config: {
  batch_size: 1
  num_visualizations: 10 
  visualization_export_dir : ""
  metrics_set: "pascal_voc_detection_metrics"
  use_moving_averages: false
  min_score_threshold: 0.5
  force_no_resize: false   
}
eval_input_reader: {
  name: ""
  label_map_path: "PATH_TO_LABELMAP/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  sample_1_of_n_examples: 1
  tf_record_input_reader {
    input_path: "PATH_TO_DATA/val.record"
  }
}