Encoding nested objects in a TFRecord with Python (Python, TensorFlow)

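I'm new to TensorFlow and I'm trying to split a large dataset into TFRecords. The format I'm encoding looks like this:

ID (string, bytes)
Index (int64)
Time (int64)
Image (image, bytes)
Labels (list of labels, bytes)

Each label object has FrameID (int64), Category (int64), x1 (float), x2 (float), y1 (float) and y2 (float). However, I'm struggling to serialize this information. I split the list of labels into lists corresponding to the object's attributes (i.e. id[], category[], ...).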
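The splitting step described above could look roughly like the following. This is only a sketch: the Label dataclass and the labels_to_columns helper are hypothetical names introduced for illustration, and the sample values are the ones passed to the writer later in the question.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Label:
    # Hypothetical container mirroring the label fields listed in the question.
    frame_id: int
    category: int
    x1: float
    x2: float
    y1: float
    y2: float


def labels_to_columns(labels: List[Label]) -> Dict[str, list]:
    """Turn a list of Label objects into one list per attribute."""
    return {
        'frame_id': [label.frame_id for label in labels],
        'category': [label.category for label in labels],
        'x1': [label.x1 for label in labels],
        'x2': [label.x2 for label in labels],
        'y1': [label.y1 for label in labels],
        'y2': [label.y2 for label in labels],
    }


labels = [Label(3, 1, 2.2, 4.4, 6.6, 8.8), Label(4, 2, 3.3, 5.5, 7.7, 9.9)]
columns = labels_to_columns(labels)
```

Each resulting list has one entry per label, so every list-valued feature in a record ends up with the same length.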

Currently, this is how individual elements are serialized, adapted from the TFRecord documentation page:

import tensorflow as tf

def _bytes_feature(value):
  """Returns a bytes_list from a string / byte."""
  if isinstance(value, type(tf.constant(0))):
    value = value.numpy() # BytesList won't unpack a string from an EagerTensor.
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _float_feature(value):
  """Returns a float_list from a float / double."""
  return tf.train.Feature(float_list=tf.train.FloatList(value=[value]))

def _float_list_feature(value):
  return tf.train.Feature(float_list=tf.train.FloatList(value=value))

def _int64_feature(value):
  """Returns an int64_list from a bool / enum / int / uint."""
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _int64_list_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=value))

This is how the data is written to the tfrecords file:

def serialize_header(feature0, feature1, feature2, feature3, feature4, feature5, feature6, feature7, feature8, feature9):
    """
    Creates a tf.train.Example message ready to be written to a file.
    """
    # Create a dictionary mapping the feature name to the tf.train.Example-compatible data type.
    feature = {
        'id': _bytes_feature(feature0),
        'index': _int64_feature(feature1),
        'time': _int64_feature(feature2),
        'image': _bytes_feature(feature3),
        'frame_id': _int64_list_feature(feature4),
        'category': _int64_list_feature(feature5),
        'x1': _float_list_feature(feature6),
        'x2': _float_list_feature(feature7),
        'y1': _float_list_feature(feature8),
        'y2': _float_list_feature(feature9)
    }
    # Create a Features message using tf.train.Example.
    example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
    return example_proto.SerializeToString()

with tf.io.TFRecordWriter('test.tfrecords') as writer:
   result = serialize_header(b'TestID', 3, 4, open("b1c66a42-6f7d68ca.jpg", 'rb').read(), [3, 4], [1,2], [2.2, 3.3], [4.4, 5.5], [6.6, 7.7], [8.8, 9.9])
   print(result)
   writer.write(result)

So far, everything works. It's only when I try to read the data back from the dataset that I run into an error:

raw_dataset = tf.data.TFRecordDataset('test.tfrecords')

# Create a dictionary describing the features.
feature_description = {
    'id': tf.io.FixedLenFeature([], tf.string),
    'index': tf.io.FixedLenFeature([], tf.int64),
    'time': tf.io.FixedLenFeature([], tf.int64),
    'image': tf.io.FixedLenFeature([], tf.string),
    'frame_id': tf.io.FixedLenFeature([], tf.int64),
    'category': tf.io.FixedLenFeature([], tf.int64),
    'x1': tf.io.FixedLenFeature([], tf.float32),
    'x2': tf.io.FixedLenFeature([], tf.float32),
    'y1': tf.io.FixedLenFeature([], tf.float32),
    'y2': tf.io.FixedLenFeature([], tf.float32)
}

def _parse_function(example_proto):
  # Parse the input tf.train.Example proto using the dictionary above.
  return tf.io.parse_single_example(example_proto, feature_description)

parsed_dataset = raw_dataset.map(_parse_function)
print(parsed_dataset)

for image_features in parsed_dataset:
  image_raw = image_features['id'].numpy()
  display(Image(data=image_raw))

The error is:

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-32-c5d6610d5b7f> in <module>()
     49 print(parsed_dataset)
     50 
---> 51 for image_features in parsed_dataset:
     52   image_raw = image_features['id'].numpy()
     53   display(Image(data=image_raw))
InvalidArgumentError: Key: y2.  Can't parse serialized Example.
     [[{{node ParseSingleExample/ParseExample/ParseExampleV2}}]]

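I can't tell whether I've encoded the data correctly and decoded it wrongly, the other way round, or both. If anyone has expertise in this, that would be great.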
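One way to separate the encoding question from the decoding question is to decode the serialized bytes directly as a protocol buffer, without any feature description, and check what each feature actually holds. The helper below is a sketch; describe_serialized_example is a hypothetical name, and the demo record contains only two of the features from the question.

```python
import tensorflow as tf


def describe_serialized_example(serialized: bytes) -> dict:
    """Return {feature name: (kind, number of values)} for a serialized tf.train.Example."""
    example = tf.train.Example.FromString(serialized)
    summary = {}
    for name, feature in example.features.feature.items():
        # 'kind' is the oneof field: 'bytes_list', 'float_list' or 'int64_list'.
        kind = feature.WhichOneof('kind')
        count = len(getattr(feature, kind).value) if kind else 0
        summary[name] = (kind, count)
    return summary


# Reduced demo record: one scalar feature and one list-valued feature.
demo = tf.train.Example(features=tf.train.Features(feature={
    'index': tf.train.Feature(int64_list=tf.train.Int64List(value=[3])),
    'y2': tf.train.Feature(float_list=tf.train.FloatList(value=[8.8, 9.9])),
})).SerializeToString()

summary = describe_serialized_example(demo)
```

If a feature reports more than one value here, the record was written with a list, and the feature description used for parsing has to allow for that.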

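A feature created with _int64_list_feature or _float_list_feature holds a list of values, so it should not be parsed with FixedLenFeature([], tf.int64/tf.float32): the empty shape [] describes a single scalar.

That worked! Thank you very much for your help.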
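The thread does not preserve which feature type replaced FixedLenFeature([], ...), so the following is only a sketch of one option that handles list-valued features: tf.io.VarLenFeature, which accepts any number of values and returns a SparseTensor. The record here is reduced to two of the list features, and parse_lists is a hypothetical helper name.

```python
import tensorflow as tf


def _int64_list_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def _float_list_feature(value):
    return tf.train.Feature(float_list=tf.train.FloatList(value=value))


# Write side: a reduced record with only two of the list-valued features.
serialized = tf.train.Example(features=tf.train.Features(feature={
    'frame_id': _int64_list_feature([3, 4]),
    'y2': _float_list_feature([8.8, 9.9]),
})).SerializeToString()

# Read side: VarLenFeature accepts any number of values per feature.
list_feature_description = {
    'frame_id': tf.io.VarLenFeature(tf.int64),
    'y2': tf.io.VarLenFeature(tf.float32),
}


def parse_lists(example_proto):
    parsed = tf.io.parse_single_example(example_proto, list_feature_description)
    # VarLenFeature yields SparseTensors; convert them to dense 1-D tensors.
    return {name: tf.sparse.to_dense(value) for name, value in parsed.items()}


parsed = parse_lists(serialized)
frame_ids = parsed['frame_id'].numpy().tolist()
y2_values = parsed['y2'].numpy().tolist()
```

If every record is known to carry the same number of labels, a fixed shape such as tf.io.FixedLenFeature([2], tf.int64) should also work. For lists whose length varies between records, tf.io.FixedLenSequenceFeature([], tf.int64, allow_missing=True) is another possibility.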