Python: what is the proper way to stop a TensorFlow Dataset built with from_generator?


python tensorflow


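I want to use a TensorFlow Dataset, built with from_generator, to access formatted files. Most of it works, except that I don't know how to stop the Dataset iterator when the generator runs out of data (once you go out of range, the generator just keeps returning empty lists forever).
My actual code is quite complicated, but I can simulate the situation with this short program: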
import tensorflow as tf

def make_batch_generator_fn(batch_size=10, dset_size=100):
    feats, targs = range(dset_size), range(1, dset_size + 1)

    def batch_generator_fn():
        start_idx, stop_idx = 0, batch_size
        while True:
            # if stop_idx > dset_size: --- stop action?
            yield feats[start_idx: stop_idx], targs[start_idx: stop_idx]
            start_idx, stop_idx = start_idx + batch_size, stop_idx + batch_size

    return batch_generator_fn

def test(batch_size=10):
    dgen = make_batch_generator_fn(batch_size)
    features_shape, targets_shape = [None], [None]
    ds = tf.data.Dataset.from_generator(
        dgen, (tf.int32, tf.int32),
        (tf.TensorShape(features_shape), tf.TensorShape(targets_shape))
    )
    feats, targs = ds.make_one_shot_iterator().get_next()

    with tf.Session() as sess:
        counter = 0
        try:
            while True:
                f, t = sess.run([feats, targs])
                print(f, t)
                counter += 1
                if counter > 15:
                    break
        except tf.errors.OutOfRangeError:
            print('end of dataset at counter = {}'.format(counter))

if __name__ == '__main__':
    test()

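If I knew the number of records in advance, I could adjust the number of batches, but I don't always know it. I have tried adding code at the commented line marked "stop action?" in the snippet above. In particular, I tried raising an IndexError, but TensorFlow does not like that, even when I catch it explicitly in the executing code. I also tried raising a tf.errors.OutOfRangeError, but I am not sure how to instantiate it: the constructor takes three arguments ("node_def", "op", and "message"), and I am not sure what is normally passed for "node_def" and "op".
I would appreciate any thoughts or comments on this question. Thanks!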
Return from the generator when the stop condition is met:
def make_batch_generator_fn(batch_size=10, dset_size=100):
    feats, targs = range(dset_size), range(1, dset_size + 1)

    def batch_generator_fn():
        start_idx, stop_idx = 0, batch_size
        while True:
            if stop_idx > dset_size:
                return
            else:
                yield feats[start_idx: stop_idx], targs[start_idx: stop_idx]
                start_idx, stop_idx = start_idx + batch_size, stop_idx + batch_size

    return batch_generator_fn

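This matches the behavior specified in the Python documentation:
In a generator function, the return statement indicates that the generator is done and will cause StopIteration to be raised. The returned value (if any) is used as an argument to construct StopIteration and becomes the StopIteration.value attribute.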
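The same mechanism can be seen in plain Python, without TensorFlow. The following is a minimal sketch, not the asker's code: the helper name batches is hypothetical, and it assumes the records come from an iterable whose length is not known in advance. Unlike the answer above, it also yields a shorter final batch instead of dropping it.

```python
from itertools import islice


def batches(records, batch_size):
    """Yield lists of up to batch_size items; return once records is exhausted."""
    it = iter(records)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # ends the generator; the consumer sees StopIteration
        yield batch


gen = batches(range(25), batch_size=10)
collected = list(gen)  # list() or a for loop stops cleanly at the return
print([len(b) for b in collected])

# Calling next() on the exhausted generator raises StopIteration explicitly.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)
```

Because the stop condition is "the source produced nothing", the generator needs no dset_size argument at all.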
It works with the following lines:
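# Assumed to be defined already:
#   dataset_size: your dataset size
#   batch_size:   your batch size
#   dataset:      your tf.data.Dataset
steps_per_epoch = dataset_size // batch_size

# zip stops as soon as range(steps_per_epoch) is exhausted
for data, _ in zip(dataset, range(steps_per_epoch)):
    pass  # your train_step goes here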

The loop stops after steps_per_epoch iterations.
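An equivalent way to cap the loop in plain Python is itertools.islice, which avoids the unused loop variable. This is only a sketch: endless_batches is a hypothetical stand-in for a dataset that never ends on its own, and the sizes are example values.

```python
from itertools import islice


def endless_batches(batch_size):
    """Hypothetical stand-in for a data source that never stops by itself."""
    start = 0
    while True:
        yield list(range(start, start + batch_size))
        start += batch_size


dataset_size, batch_size = 100, 10  # example values
steps_per_epoch = dataset_size // batch_size

# islice stops after steps_per_epoch items, like zip(..., range(steps_per_epoch)).
seen = [batch for batch in islice(endless_batches(batch_size), steps_per_epoch)]
print(len(seen), seen[-1][-1])
```

On the TensorFlow side, tf.data.Dataset.take(count) applies the same kind of cap to the dataset itself. Either way, this approach still requires knowing the dataset size up front, which the question says is not always the case.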
@M-Chen-3 Hi, I have updated the code; I think it explains itself now :)