Using TensorFlow TFRecords with datasets of varying image sizes

The example in the TensorFlow tutorial demonstrates the use of TFRecords with the MNIST dataset. The MNIST dataset is converted to a TFRecords file like this:

def convert_to(data_set, name):
  images = data_set.images
  labels = data_set.labels
  num_examples = data_set.num_examples

  if images.shape[0] != num_examples:
    raise ValueError('Images size %d does not match label size %d.' %
                     (images.shape[0], num_examples))
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(FLAGS.directory, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    image_raw = images[index].tobytes()
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'label': _int64_feature(int(labels[index])),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())
  writer.close()

It is then read and decoded like this:

def read_and_decode(filename_queue):
  reader = tf.TFRecordReader()
  _, serialized_example = reader.read(filename_queue)
  features = tf.parse_single_example(
      serialized_example,
      # Defaults are not specified since both keys are required.
      features={
          'image_raw': tf.FixedLenFeature([], tf.string),
          'label': tf.FixedLenFeature([], tf.int64),
      })

  # Convert from a scalar string tensor (whose single string has
  # length mnist.IMAGE_PIXELS) to a uint8 tensor with shape
  # [mnist.IMAGE_PIXELS].
  image = tf.decode_raw(features['image_raw'], tf.uint8)
  image.set_shape([mnist.IMAGE_PIXELS])

  # OPTIONAL: Could reshape into a 28x28 image and apply distortions
  # here.  Since we are not applying any distortions in this
  # example, and the next step expects the image to be flattened
  # into a vector, we don't bother.

  # Convert from [0, 255] -> [-0.5, 0.5] floats.
  image = tf.cast(image, tf.float32) * (1. / 255) - 0.5

  # Convert label from a scalar uint8 tensor to an int32 scalar.
  label = tf.cast(features['label'], tf.int32)

  return image, label

Question: is there a way to read images of different sizes from TFRecords? Because at this point

image.set_shape([mnist.IMAGE_PIXELS])

the sizes of all tensors need to be known. That means I cannot do something like

width = tf.cast(features['width'], tf.int32)
height = tf.cast(features['height'], tf.int32)
image = tf.reshape(image, [height, width, 3])

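So how would one use TFRecords in this case?

Also, I do not understand why the tutorial authors save the height and width in the TFRecords file if they do not use them when reading and decoding the image and rely on predefined constants instead.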

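For training in this particular case there is no reason to keep the width and height around. However, since the images are serialized into a single stream of bytes, your future self might wonder what shape the data originally had, rather than just seeing a flat run of bytes. In essence, they are simply creating self-contained examples.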
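To make the "self-contained example" point concrete, here is a minimal NumPy sketch. It is not taken from the tutorial, and restore_image is a hypothetical helper name. It shows that serializing an array to bytes discards its shape, and that the stored height, width and depth are exactly what is needed to restore it:

```python
import numpy as np


def restore_image(raw_bytes, height, width, depth, dtype=np.uint8):
    """Rebuild an image from its raw bytes plus the shape stored next to them."""
    flat = np.frombuffer(raw_bytes, dtype=dtype)  # 1-D: the shape is gone
    return flat.reshape(height, width, depth)     # row-major, like tobytes()


# A made-up 2x3 RGB image standing in for one dataset entry.
image = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
raw = image.tobytes()

# The byte string only knows its length, not that it used to be 2x3x3.
flat = np.frombuffer(raw, dtype=np.uint8)
restored = restore_image(raw, *image.shape)
```

Without the three stored integers, a reader of the byte string could not tell a 2x3x3 image from a 3x2x3 or a 6x3x1 one.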

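As for differently sized images, keep in mind that at some point you need to map the feature tensors to the weights, and since the number of weights is fixed for a given network, the dimensions of the feature tensors must be fixed as well. Another point to consider is data normalization: if you use images of different shapes, do they have the same mean and variance? You may choose to ignore that, but if you don't, you have to come up with a solution for that as well.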
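The fixed-weights argument can be sketched with a single dense layer in NumPy. The layer size, the random data and the helper name dense_forward are assumptions made purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dense layer sized for flattened 28x28x1 inputs (784 features).
weights = rng.normal(size=(28 * 28 * 1, 10))


def dense_forward(image, weights):
    """Flatten the image and multiply it with a fixed-size weight matrix."""
    return image.reshape(-1) @ weights


small = rng.random((28, 28, 1))
large = rng.random((100, 100, 3))

logits = dense_forward(small, weights)  # works: 784 features, 784 weight rows

try:
    dense_forward(large, weights)       # 30000 features do not fit 784 rows
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

The second call fails because the weight matrix was created for one specific input size, which is why images have to be brought to a common shape before they reach such a layer.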

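If you are just asking about using images of a different size, i.e. 100x100x3 instead of 28x28x1, you can of course use

image = tf.reshape(image, [100, 100, 3])

to reshape a single flat tensor of 30000 elements in total into a single rank-3 tensor (image.set_shape() on its own only annotates the static shape and cannot change the rank of the flat decoded tensor). Or, if you are working with batches (of a size to be determined), you can use

image_batch.set_shape([None, 100, 100, 3])

Note that this is not a list of tensors but a single rank-4 tensor, because all images in the batch must have the same dimensions; i.e. it is not possible to have a 100x100x3 image followed by a 28x28x1 image in the same batch.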
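The same-shape requirement for a batch can be illustrated with plain NumPy arrays. This is only a sketch: pad_to is a hypothetical helper, zero-padding is just one possible way to equalize shapes, and both images are given three channels here so that only height and width differ:

```python
import numpy as np


def pad_to(image, height, width):
    """Zero-pad an HxWxC image at the bottom/right up to height x width."""
    h, w, _ = image.shape
    return np.pad(image, ((0, height - h), (0, width - w), (0, 0)))


a = np.ones((100, 100, 3), dtype=np.uint8)
b = np.ones((28, 28, 3), dtype=np.uint8)

try:
    np.stack([a, b])          # differently shaped images cannot form one array
    stack_failed = False
except ValueError:
    stack_failed = True

# Once both images share a shape, they stack into a single rank-4 batch.
batch = np.stack([pad_to(a, 100, 100), pad_to(b, 100, 100)])
```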

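Before batching, though, you are free to use whatever size and shape you want, and you can also load the shape from the records, which the MNIST example does not do. For example, you could apply an image-processing operation (such as resizing or cropping) to obtain an augmented image of fixed size for further processing.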
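A minimal sketch of this idea follows, assuming TensorFlow 2.x with eager execution (the tf.io and tf.data APIs rather than the queue-based reader used in the question). TARGET_SIZE and the function names write_records, parse_example and load_dataset are hypothetical, and bilinear resizing is just one way to reach a fixed size:

```python
import numpy as np
import tensorflow as tf

TARGET_SIZE = (32, 32)  # hypothetical fixed size chosen for batching


def _int64(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def write_records(path, images, labels):
    """Write uint8 HxWxC images of arbitrary size, each with its own shape."""
    with tf.io.TFRecordWriter(path) as writer:
        for image, label in zip(images, labels):
            height, width, depth = image.shape
            example = tf.train.Example(features=tf.train.Features(feature={
                'height': _int64(height),
                'width': _int64(width),
                'depth': _int64(depth),
                'label': _int64(int(label)),
                'image_raw': tf.train.Feature(
                    bytes_list=tf.train.BytesList(value=[image.tobytes()])),
            }))
            writer.write(example.SerializeToString())


def parse_example(serialized):
    """Decode one record, restore its stored shape, then resize it."""
    features = tf.io.parse_single_example(serialized, {
        'height': tf.io.FixedLenFeature([], tf.int64),
        'width': tf.io.FixedLenFeature([], tf.int64),
        'depth': tf.io.FixedLenFeature([], tf.int64),
        'label': tf.io.FixedLenFeature([], tf.int64),
        'image_raw': tf.io.FixedLenFeature([], tf.string),
    })
    flat = tf.io.decode_raw(features['image_raw'], tf.uint8)
    # The shape is a tensor read from the record, so it may differ per example.
    shape = tf.stack([features['height'], features['width'], features['depth']])
    image = tf.reshape(flat, shape)
    # After resizing, every image has the same height and width (float32).
    image = tf.image.resize(image, TARGET_SIZE)
    return image, tf.cast(features['label'], tf.int32)


def load_dataset(path, batch_size):
    """Read the records and batch the resized images."""
    return tf.data.TFRecordDataset(path).map(parse_example).batch(batch_size)
```

The point of the sketch is that tf.reshape accepts a shape that is itself a tensor, so the stored height and width can be used per example; only the batching step needs all images to agree, which the resize takes care of. All images are still assumed to share the same number of channels.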

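Note also that the serialized representation of an image may well have varying length and shape. For example, you might decide to store a compressed encoding (such as JPEG or PNG) instead of raw pixel values; the records would then obviously have different sizes.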
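As a rough illustration of storing an encoded image instead of raw pixels, the following sketch uses PNG and assumes TensorFlow 2.x with eager execution. PNG is only one possible choice here, and encode_image and decode_image are hypothetical helper names:

```python
import numpy as np
import tensorflow as tf


def encode_image(image):
    """Return PNG bytes for a uint8 HxWxC array (C in 1, 2, 3 or 4)."""
    return tf.io.encode_png(image).numpy()


def decode_image(png_bytes):
    """Decode PNG bytes; height, width and channels come from the PNG itself."""
    return tf.io.decode_png(png_bytes)


small = np.zeros((28, 28, 1), dtype=np.uint8)
large = np.zeros((100, 100, 3), dtype=np.uint8)

# Each byte string could be stored in a bytes feature such as 'image_raw'.
small_png = encode_image(small)
large_png = encode_image(large)
```

Because the encoded bytes carry their own dimensions, the decoded tensor comes back with the right shape even if no separate height and width features are stored.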

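Finally, besides tf.FixedLenFeature() there is also tf.VarLenFeature() for variable-length features, but it produces a sparse (tf.SparseTensor) representation. That is typically not relevant for non-binary images, though.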