A more efficient way to load images for detection in TensorFlow


I am using the TensorFlow Object Detection API for a semi-real-time object detection task. The camera captures images at 2 frames per second, and each image is cropped into 4 smaller images, so I need to process 8 images per second in total.
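
For context, the cropping step itself can be done with plain numpy slicing. A minimal sketch, assuming each frame is simply split into four equal quadrants (the actual crop layout is an assumption, not something stated above):

import numpy as np

def crop_into_quadrants(frame):
    # Assumed layout: split an H x W x 3 frame into four equal quadrants.
    h, w = frame.shape[0] // 2, frame.shape[1] // 2
    return [frame[:h, :w], frame[:h, w:], frame[h:, :w], frame[h:, w:]]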

My detection model has been exported as a frozen graph (a .pb file) and loaded into GPU memory. I then load the images into numpy arrays to feed them into the model.

The detection itself takes only about 0.1 s per image; loading each image, however, takes about 0.45 s.

The script I am using is modified from the code example provided by the Object Detection API (). It reads each image, converts it into a numpy array and feeds it to the detection model. The most time-consuming part of this process is loading the image into the array, which takes almost 0.45 s.

The script is as follows:

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
import timeit
import scipy.misc


from collections import defaultdict
from io import StringIO
from matplotlib import pyplot as plt
from PIL import Image


from utils import label_map_util

from utils import visualization_utils as vis_util

# Path to frozen detection graph. This is the actual model that is used for the
# object detection.
PATH_TO_CKPT = 'animal_detection.pb'

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'animal_label_map.pbtxt')

NUM_CLASSES = 1


detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def,name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map,
                                                            max_num_classes=NUM_CLASSES,
                                                            use_display_name=True)
category_index = label_map_util.create_category_index(categories)

def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# For this test we use 9 PNG images (image1.png ... image9.png) from the
# 'test' directory. If you want to test the code with your own images, just
# add their paths to TEST_IMAGE_PATHS.
PATH_TO_TEST_IMAGES_DIR = 'test'
TEST_IMAGE_PATHS = [
    os.path.join(PATH_TO_TEST_IMAGES_DIR, 'image{}.png'.format(i)) for i in range(1, 10) ]

# Size, in inches, of the output images.
IMAGE_SIZE = (12, 8)
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
with detection_graph.as_default():
  with tf.Session(graph=detection_graph, config=config) as sess:
    for image_path in TEST_IMAGE_PATHS:
      start = timeit.default_timer()
      image = Image.open(image_path)
      # the array based representation of the image will be used later in order to prepare the
      # result image with boxes and labels on it.
      image_np = load_image_into_numpy_array(image)
      # Expand dimensions since the model expects images to have shape: [1, None, None, 3]
      image_np_expanded = np.expand_dims(image_np, axis=0)
      image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
      end = timeit.default_timer()
      print(end-start)
      start = timeit.default_timer()
      # Each box represents a part of the image where a particular object was detected.
      boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
      # Each score represent how level of confidence for each of the objects.
      # Score is shown on the result image, together with the class label.
      scores = detection_graph.get_tensor_by_name('detection_scores:0')
      classes = detection_graph.get_tensor_by_name('detection_classes:0')
      num_detections = detection_graph.get_tensor_by_name('num_detections:0')
      # Actual detection.
      (boxes, scores, classes, num_detections) = sess.run(
          [boxes, scores, classes, num_detections],
          feed_dict={image_tensor: image_np_expanded})
      stop = timeit.default_timer()
      print (stop - start)
      # Visualization of the results of a detection.
      vis_util.visualize_boxes_and_labels_on_image_array(
          image_np,
          np.squeeze(boxes),
          np.squeeze(classes).astype(np.int32),
          np.squeeze(scores),
          category_index,
          use_normalized_coordinates=True,
          line_thickness=2)
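
A note on the script above: most of the 0.45 s goes into load_image_into_numpy_array, because image.getdata() is iterated pixel by pixel in Python before numpy ever sees the data. A minimal drop-in replacement, assuming the same PIL image input, lets numpy read the image buffer directly and is typically much faster:

import numpy as np
from PIL import Image

def load_image_into_numpy_array(image):
    # Convert to RGB first so paletted or RGBA PNGs still yield 3 channels,
    # then let numpy read the PIL buffer directly instead of looping in Python.
    return np.asarray(image.convert('RGB'), dtype=np.uint8)
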
I am considering a more efficient way to load the images produced by the camera. The first idea is to avoid numpy arrays and use TensorFlow's native methods to load the images instead, but I do not know where to start because I am very new to TensorFlow.

If I can find a TensorFlow way to load images, perhaps I can also group the 4 crops into one batch and feed them to the model together, which might give me some additional speedup; a sketch of such a pipeline is shown below.
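
A minimal sketch of that idea with tf.data, written against the same TF 1.x API used in the scripts here (the helper name, the parallelism level and the default batch size are illustrative assumptions):

import tensorflow as tf

def make_image_batches(image_paths, batch_size=4):
    # Read, decode and batch PNG files inside the TensorFlow graph.
    def _load(path):
        data = tf.read_file(path)
        return tf.image.decode_png(data, channels=3)
    dataset = (tf.data.Dataset.from_tensor_slices(image_paths)
               .map(_load, num_parallel_calls=4)
               .batch(batch_size)
               .prefetch(1))
    return dataset.make_one_shot_iterator().get_next()

Running the returned tensor in the session yields a [batch, height, width, 3] uint8 array that can be fed through feed_dict exactly like image_np_expanded above; note that batching like this only works because the 4 crops share the same dimensions.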

A rougher idea is to save the 4 small images cropped from one original frame into a TFRecord file and then load that TFRecord file into the model as a single batch, but I have no idea how to implement that.
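
For reference, a minimal sketch of the TFRecord route under the same TF 1.x assumptions (the 'image/encoded' feature name and the helper functions are illustrative, not a required format): write the 4 PNG-encoded crops into one record file, then decode and batch them when reading.

import tensorflow as tf

def write_crops(crop_png_bytes, record_path):
    # crop_png_bytes: a list of 4 PNG-encoded byte strings, one per crop.
    with tf.python_io.TFRecordWriter(record_path) as writer:
        for data in crop_png_bytes:
            example = tf.train.Example(features=tf.train.Features(feature={
                'image/encoded': tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[data]))}))
            writer.write(example.SerializeToString())

def read_crops_as_batch(record_path, batch_size=4):
    # Decode each record back into a uint8 image and batch all 4 crops together.
    def _parse(serialized):
        features = tf.parse_single_example(
            serialized, {'image/encoded': tf.FixedLenFeature([], tf.string)})
        return tf.image.decode_png(features['image/encoded'], channels=3)
    dataset = tf.data.TFRecordDataset(record_path).map(_parse).batch(batch_size)
    return dataset.make_one_shot_iterator().get_next()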


Any help would be greatly appreciated.

I found a solution that reduces the image loading time from about 0.4 s to about 0.01 s, and I am posting it here in case someone else runs into the same problem. Instead of using PIL.Image and numpy, we can use imread from OpenCV (cv2). I also managed to batch the images, which gives a further speedup.

The script is as follows:

import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tensorflow as tf
import timeit
import cv2


from collections import defaultdict

from utils import label_map_util

from utils import visualization_utils as vis_util

MODEL_PATH = sys.argv[1]
IMAGE_PATH = sys.argv[2]
BATCH_SIZE = int(sys.argv[3])
# Path to frozen detection graph. This is the actual model that is used for the
# object detection.
PATH_TO_CKPT = os.path.join(MODEL_PATH, 'frozen_inference_graph.pb')

# List of the strings that is used to add correct label for each box.
PATH_TO_LABELS = os.path.join('data', 'animal_label_map.pbtxt')

NUM_CLASSES = 1

detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def,name='')

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map,
                                                            max_num_classes=NUM_CLASSES,
                                                            use_display_name=True)
category_index = label_map_util.create_category_index(categories)

PATH_TO_TEST_IMAGES_DIR = IMAGE_PATH
TEST_IMAGE_PATHS = [
    os.path.join(PATH_TO_TEST_IMAGES_DIR,'image{}.png'.format(i)) for i in range(1, 129) ]

config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
with detection_graph.as_default():
  with tf.Session(graph=detection_graph, config=config) as sess:
    for i in range(0, len(TEST_IMAGE_PATHS), BATCH_SIZE):
        images = []
        start = timeit.default_timer()
        for j in range(0, BATCH_SIZE):
            image = cv2.imread(TEST_IMAGE_PATHS[i+j])
            image = np.expand_dims(image, axis=0)
            images.append(image)
        # Concatenate once per batch instead of on every loop iteration.
        image_np_expanded = np.concatenate(images, axis=0)
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        # Each box represents a part of the image where a particular object was detected.
        boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        # Each score represent how level of confidence for each of the objects.
        # Score is shown on the result image, together with the class label.
        scores = detection_graph.get_tensor_by_name('detection_scores:0')
        classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        # Actual detection.
        (boxes, scores, classes, num_detections) = sess.run(
            [boxes, scores, classes, num_detections],
            feed_dict={image_tensor: image_np_expanded})
        stop = timeit.default_timer()
        print (stop - start)
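
One caveat with the OpenCV version: cv2.imread returns images in BGR channel order, while the Object Detection API models are typically trained on RGB input, so it is usually safer to convert each image before batching, for example:

image = cv2.cvtColor(cv2.imread(TEST_IMAGE_PATHS[i+j]), cv2.COLOR_BGR2RGB)

Without the conversion the detector still runs, but the swapped channels can degrade detection quality for color-sensitive classes.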