Python: Training a classifier from TFRecords in TensorFlow


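I already have some code that trains a classifier from numpy arrays. However, my training dataset is very large, and the recommended solution seems to be TFRecords. My attempts to use TFRecords with my own dataset failed, so I have gradually reduced my code to a minimal example: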

import tensorflow as tf

def readsingleexample(serialized):
    print("readsingleexample", serialized)
    feature = dict()
    feature['x'] = tf.FixedLenFeature([], tf.int64)
    feature['label'] = tf.FixedLenFeature([], tf.int64)
    parsed_example = tf.parse_single_example(serialized, features=feature)
    print(parsed_example)
    return parsed_example['x'], parsed_example['label']

def TestParse(filename):
    record_iterator=tf.python_io.tf_record_iterator(path=filename)
    for string_record in record_iterator:
        example=tf.train.Example()
        example.ParseFromString(string_record)
        print(example.features)

def TestRead(filename):
    record_iterator=tf.python_io.tf_record_iterator(path=filename)
    for string_record in record_iterator:
        feats, label = readsingleexample(string_record)
        print(feats, label)

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def TFRecordsTest(filename):

    example=tf.train.Example(features=tf.train.Features(feature={
        'x': _int64_feature(7),
        'label': _int64_feature(4)
        }))
    writer = tf.python_io.TFRecordWriter(filename)
    writer.write(example.SerializeToString())

    record_iterator=tf.python_io.tf_record_iterator(path=filename)
    for string_record in record_iterator:
        example=tf.train.Example()
        example.ParseFromString(string_record)
        print(example.features)

    dataset=tf.data.TFRecordDataset(filenames=[filename])
    dataset=dataset.map(readsingleexample)
    dataset=dataset.repeat()

    def train_input_fn():
        iterator=dataset.make_one_shot_iterator()
        feats_tensor, labels_tensor = iterator.get_next()
        return {"x":feats_tensor}, labels_tensor

    feature_columns = []
    feature_columns.append(tf.feature_column.numeric_column(key='x'))

    classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                      hidden_units=[10, 10, 10],
                                      n_classes=2)
    classifier.train(input_fn=train_input_fn, steps=1000)

    return

This produces the following output:

feature {
  key: "label"
  value {
    int64_list {
      value: 4
    }
  }
}
feature {
  key: "x"
  value {
    int64_list {
      value: 7
    }
  }
}

readsingleexample Tensor("arg0:0", shape=(), dtype=string)
{'x': <tf.Tensor 'ParseSingleExample/ParseSingleExample:1' shape=() dtype=int64>, 'label': <tf.Tensor 'ParseSingleExample/ParseSingleExample:0' shape=() dtype=int64>}
WARNING:tensorflow:Using temporary folder as model directory: C:\Users\eeark\AppData\Local\Temp\tmpcl47b2ut
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    tfrecords_test.TFRecordsTest(fn)
  File "C:\_P4\user_feindselig\_python\tfrecords_test.py", line 60, in TFRecordsTest
    classifier.train(input_fn=train_input_fn, steps=1000)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\estimator\estimator.py", line 352, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\estimator\estimator.py", line 812, in _train_model
    features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\estimator\estimator.py", line 793, in _call_model_fn
    model_fn_results = self._model_fn(features=features, **kwargs)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\estimator\canned\dnn.py", line 354, in _model_fn
    config=config)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\estimator\canned\dnn.py", line 185, in _dnn_model_fn
    logits = logit_fn(features=features, mode=mode)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\estimator\canned\dnn.py", line 91, in dnn_logit_fn
    features=features, feature_columns=feature_columns)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 273, in input_layer
    trainable, cols_to_vars)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 198, in _internal_input_layer
    trainable=trainable)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 2080, in _get_dense_tensor
    return inputs.get(self)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 1883, in get
    transformed = column._transform_feature(self)  # pylint: disable=protected-access
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 2048, in _transform_feature
    input_tensor = inputs.get(self.key)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 1870, in get
    feature_tensor = self._get_raw_feature_as_tensor(key)
  File "C:\Program Files\Python352\lib\site-packages\tensorflow\python\feature_column\feature_column.py", line 1924, in _get_raw_feature_as_tensor
    key, feature_tensor))
ValueError: Feature (key: x) cannot have rank 0. Give: Tensor("IteratorGetNext:0", shape=(), dtype=int64, device=/device:CPU:0)

What does this error mean? What could be going wrong?

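The following seems to work, or at least raises no error: using tf.parse_example([serialized], ...) in place of tf.parse_single_example(serialized, ...). (In addition, the labels in the synthetic data were changed to be less than the number of classes.)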
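A minimal sketch of why that swap can help, assuming a TensorFlow version that exposes the tf.io aliases (tf.io.parse_example, tf.io.parse_single_example, tf.io.FixedLenFeature). The helper make_serialized_example is hypothetical and not part of the original code. It parses the same record both ways and compares the rank of 'x':

```python
import tensorflow as tf

def make_serialized_example(x, label):
    """Hypothetical helper: serialize one Example with two int64 features."""
    example = tf.train.Example(features=tf.train.Features(feature={
        'x': tf.train.Feature(int64_list=tf.train.Int64List(value=[x])),
        'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))
    return example.SerializeToString()

# Same parsing spec as in the question: both features declared as scalars.
feature_spec = {
    'x': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
}

# Label 1 is used here because the classifier was built with n_classes=2.
serialized = make_serialized_example(7, 1)

# One record in, scalar features out (rank 0).
single = tf.io.parse_single_example(serialized, features=feature_spec)

# A list of records in, one leading batch dimension out (rank 1).
batched = tf.io.parse_example([serialized], features=feature_spec)

single_rank = len(single['x'].shape)
batched_rank = len(batched['x'].shape)
print("parse_single_example rank:", single_rank)
print("parse_example rank:", batched_rank)
```

Wrapping the record in a list is what adds the extra dimension, so the feature column no longer receives a rank 0 tensor.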


Rank 0 means the feature is a scalar.

So make it rank 1, i.e. a vector, by adding [].
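The code that originally accompanied this answer is missing, so the following is only one possible reading of "rank 1", not necessarily what the answerer meant. As a sketch, assuming the tf.io aliases are available, a parsed feature can be given rank 1 either by declaring shape [1] in the parsing spec or by calling tf.expand_dims on the scalar:

```python
import tensorflow as tf

# A single serialized record with the same two int64 features as the question.
serialized = tf.train.Example(features=tf.train.Features(feature={
    'x': tf.train.Feature(int64_list=tf.train.Int64List(value=[7])),
    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
})).SerializeToString()

# Option 1 (assumption): declare 'x' as a length-1 vector in the spec.
spec_rank1 = {
    'x': tf.io.FixedLenFeature([1], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
}
parsed = tf.io.parse_single_example(serialized, features=spec_rank1)

# Option 2 (assumption): parse as a scalar, then add a trailing dimension.
spec_scalar = {
    'x': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
}
scalar = tf.io.parse_single_example(serialized, features=spec_scalar)
expanded = tf.expand_dims(scalar['x'], axis=-1)

print(parsed['x'].shape, expanded.shape)
```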

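Unfortunately, this does not work: TypeError: Parameter to MergeFrom() must be instance of same class: expected Feature got list.

This solved the problem for me. I don't know why the TF documentation uses parse_single_example when everyone is batching things.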
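To illustrate the batching point, here is a rough sketch that batches the serialized records first and then parses each batch with tf.io.parse_example. It assumes TensorFlow 2.x with eager execution (the question itself uses the older 1.x API), and the file name, record count, and parse_batch helper are all made up for the example:

```python
import os
import tempfile

import tensorflow as tf

# Write six toy records to a temporary file (hypothetical data).
path = os.path.join(tempfile.mkdtemp(), 'toy.tfrecords')
with tf.io.TFRecordWriter(path) as writer:
    for i in range(6):
        example = tf.train.Example(features=tf.train.Features(feature={
            'x': tf.train.Feature(int64_list=tf.train.Int64List(value=[i])),
            'label': tf.train.Feature(
                int64_list=tf.train.Int64List(value=[i % 2])),
        }))
        writer.write(example.SerializeToString())

feature_spec = {
    'x': tf.io.FixedLenFeature([], tf.int64),
    'label': tf.io.FixedLenFeature([], tf.int64),
}

def parse_batch(serialized_batch):
    """Parse a whole batch of serialized records at once."""
    parsed = tf.io.parse_example(serialized_batch, features=feature_spec)
    labels = parsed.pop('label')
    return parsed, labels

# Batch the raw strings first, then parse each batch in one call.
dataset = tf.data.TFRecordDataset([path]).batch(3).map(parse_batch)

batches = list(dataset)  # eager iteration, assumes TF 2.x
for features, labels in batches:
    print(features['x'].shape, labels.shape)
```

Because parsing happens after batching, every feature already carries a leading batch dimension when it reaches the model.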