
Python: Object detection on a video gives different predictions than object detection on images

Tags: python, tensorflow, object-detection, object-detection-api, faster-rcnn

I want to test a model I created. While testing it, I noticed that the first and the second code snippets below give different predictions. Both snippets use the same frozen inference graph and run detection on the same frame. How can I change the second snippet so that it produces the same results as the first?

First code:

cap = cv2.VideoCapture("InputVideo.mp4")
frame_array = []
with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    while cap.isOpened():
      # CAP_PROP_POS_FRAMES (property id 1) is the index of the next frame
      frameId = int(round(cap.get(cv2.CAP_PROP_POS_FRAMES)))
      ret, image_np = cap.read()
      if ret:
          if frameId % 1 == 0:  # % 1 keeps every frame; raise to subsample
              image_np_expanded = np.expand_dims(image_np, axis=0)
              image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
              boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
              scores = detection_graph.get_tensor_by_name('detection_scores:0')
              classes = detection_graph.get_tensor_by_name('detection_classes:0')
              num_detections = detection_graph.get_tensor_by_name('num_detections:0')
              (boxes, scores, classes, num_detections) = sess.run(
                  [boxes, scores, classes, num_detections],
                   feed_dict={image_tensor: image_np_expanded})
              vis_util.visualize_boxes_and_labels_on_image_array(
                  image_np,
                  np.squeeze(boxes),
                  np.squeeze(classes).astype(np.int32),
                  np.squeeze(scores),
                  category_index,
                  use_normalized_coordinates=True,
                  line_thickness=8,
                  min_score_thresh=.35)
              frame_array.append(image_np)
      else:
          break
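The `np.expand_dims` call in the loop above adds the batch dimension the detector expects. A minimal numpy-only sketch with a dummy frame (the 4x5 shape is purely illustrative):

```python
import numpy as np

# A dummy 4x5 RGB frame, standing in for what cap.read() returns.
image_np = np.zeros((4, 5, 3), dtype=np.uint8)

# The image_tensor input expects shape [1, height, width, 3],
# so a leading batch axis is added.
image_np_expanded = np.expand_dims(image_np, axis=0)

print(image_np_expanded.shape)  # (1, 4, 5, 3)
```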
Second code:

def load_image_into_numpy_array(image):
  (im_width, im_height) = image.size
  return np.array(image.getdata()).reshape((im_height, im_width, 3)).astype(np.uint8)
%matplotlib inline
with detection_graph.as_default():
  with tf.Session(graph=detection_graph) as sess:
    for image_path in TEST_IMAGE_PATHS:
      image = Image.open(image_path)
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=8,
          min_score_thresh=.35
      )
      plt.figure(figsize=IMAGE_SIZE)
      plt.imshow(image_np)  # without imshow, plt.show() displays an empty figure
      plt.show()
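The reshape inside `load_image_into_numpy_array` above relies on PIL's `getdata()` returning pixels in row-major order, so the flat array is reshaped rows first. A numpy-only sketch of that reshape (the 2x3 size is purely illustrative):

```python
import numpy as np

# Simulate PIL's getdata(): a flat, row-major sequence of RGB pixels
# for an image that is 2 pixels wide and 3 pixels tall.
im_width, im_height = 2, 3
flat = np.arange(im_width * im_height * 3, dtype=np.uint8).reshape(-1, 3)

# Same reshape as load_image_into_numpy_array: (height, width, channels).
image_np = flat.reshape((im_height, im_width, 3))

print(image_np.shape)          # (3, 2, 3)
print(image_np[1, 0].tolist()) # pixel at row 1, column 0 -> [6, 7, 8]
```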

Did you check the input images? Since the model is identical in both cases, the only possible cause is a difference in the input. You may be comparing a raw image against a compressed frame from the video (e.g., the H.264 codec is lossy). You may also have a different channel order: cv2.VideoCapture returns frames in BGR order by default, whereas PIL's Image.open yields RGB. Maybe you need to do this:

image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
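For a 3-channel image, that conversion amounts to reversing the channel axis. A numpy-only sketch of the same effect, using a tiny hand-built "image" so the swap is visible:

```python
import numpy as np

# A 1x2 "image" in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis turns BGR into RGB, matching what
# cv2.cvtColor(frame, cv2.COLOR_BGR2RGB) does for 3-channel frames.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # blue pixel in RGB -> [0, 0, 255]
print(rgb[0, 1].tolist())  # red pixel in RGB  -> [255, 0, 0]
```

In the first snippet, the conversion would go right after `cap.read()`, before `np.expand_dims`, so the detector sees the same channel order it was trained on.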