Python: considering only consecutive data rows


I have data consisting of a timestamp and some other data fields. However, some of my entries are not to be used for learning, so I removed them from the data. I end up with a dataset like this:

1 2 3 4 5 7 8 9 10

(note the gap at 6). Now, for learning, I would like Tensorflow to consider the last 3 rows (and take the prediction label from the row that follows), while respecting these gaps. In my example, the valid data packets are (1,2,3,4), (2,3,4,5) and (7,8,9,10), but not, for example, (3,4,5,7).

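I have looked at the Tensorflow API, and it seems that a custom implementation could do the trick, although at first glance the class does not look like a natural candidate for this approach (for example, there is no abstract superclass where only some small next() method needs to be implemented ;-)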

Any other ideas? How would you solve this?

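I think the simplest approach is to use the windowing feature of the tf.data.Dataset API and then filter out the windows that contain unwanted values.

For example, reusing your example: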

import tensorflow as tf

# create a dataset of the values 1 to 10 (int32, to match the dtype of to_remove)
ds = tf.data.Dataset.range(1, 11, output_type=tf.int32)
# elements that must not appear in any window
to_remove = tf.constant([6])
# create windows of size 4 with a shift of 1, keeping only full windows
windows = ds.window(size=4, shift=1, drop_remainder=True)
# window() returns a Dataset of Datasets; batch each one to get a Dataset of Tensors
windows = windows.flat_map(lambda window: window.batch(4, drop_remainder=True))
# keep only the windows that contain none of the removed elements
filtered = windows.filter(
    lambda x: tf.logical_not(tf.reduce_any(tf.equal(x, to_remove[:, tf.newaxis])))
)

If we look at the final dataset:

>>> for data in filtered:
...     print(data)
...
tf.Tensor([1 2 3 4], shape=(4,), dtype=int32)
tf.Tensor([2 3 4 5], shape=(4,), dtype=int32)
tf.Tensor([ 7  8  9 10], shape=(4,), dtype=int32)
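
The question asks for the last three rows as input and the following row as the prediction label, so each surviving window still has to be split. A minimal sketch, assuming the first three entries of a window are the features and the fourth is the label; the helper name split_window and the variable names features and label are illustrative and not part of the original answer:

```python
import tensorflow as tf

# Rebuild the filtered windows from the answer above.
to_remove = tf.constant([6])
ds = tf.data.Dataset.range(1, 11, output_type=tf.int32)
windows = ds.window(size=4, shift=1, drop_remainder=True)
windows = windows.flat_map(lambda w: w.batch(4, drop_remainder=True))
filtered = windows.filter(
    lambda x: tf.logical_not(tf.reduce_any(tf.equal(x, to_remove[:, tf.newaxis])))
)

def split_window(window):
    # Hypothetical helper: the first three values are the inputs,
    # the last value is the label to predict.
    return window[:-1], window[-1]

pairs = filtered.map(split_window)

for features, label in pairs:
    print(features.numpy(), label.numpy())
```

With the example data this should yield the pairs ([1 2 3], 4), ([2 3 4], 5) and ([7 8 9], 10).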

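"However, some of my entries are not to be used for learning, so I removed them from the data": do you still have them, or are they really gone?

I still have them (and can easily identify them in the data)... Your answer probably already covers what I need, although I am having some trouble applying it to my specific setting... I find debugging Tensorflow code rather difficult.

The filtering part is probably the trickiest. If needed, do not hesitate to ask another, more specific question.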
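
For the other case raised in this comment thread, where the removed rows are no longer available, the filter could instead test whether the timestamps inside a window are contiguous. This is only a sketch under a stated assumption: consecutive rows are exactly one time step apart, so any larger step marks a gap. The helper name is_contiguous is hypothetical.

```python
import tensorflow as tf

# Timestamps with the unwanted row (6) already dropped.
timestamps = tf.constant([1, 2, 3, 4, 5, 7, 8, 9, 10], dtype=tf.int64)
ds = tf.data.Dataset.from_tensor_slices(timestamps)

# Same windowing as in the answer: full windows of size 4, shifted by 1.
windows = ds.window(size=4, shift=1, drop_remainder=True)
windows = windows.flat_map(lambda w: w.batch(4, drop_remainder=True))

def is_contiguous(window):
    # Assumption: neighbouring rows differ by exactly 1 when nothing is missing.
    steps = window[1:] - window[:-1]
    return tf.reduce_all(tf.equal(steps, 1))

contiguous = windows.filter(is_contiguous)

for window in contiguous:
    print(window.numpy())
```

On the example data this keeps the windows [1 2 3 4], [2 3 4 5] and [7 8 9 10]. For real timestamps the expected step would have to be replaced by the actual sampling interval.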