Python 3.x: How do I shuffle data at each epoch with the tf.data API in TensorFlow 2.0?

Tags: python-3.x, tensorflow2.0

I am training my model with TensorFlow 2.0. The new iteration features of the tf.data API are great. However, when I run the following code, I find that, unlike the iteration behavior of torch.utils.data.DataLoader in PyTorch, it does not automatically reshuffle the data at each epoch. How can I achieve this with TF 2.0?

import numpy as np
import tensorflow as tf
def sample_data():
    ...

data = sample_data()

NUM_EPOCHS = 10
BATCH_SIZE = 128

# Split the data: last 20% for validation, first 80% for training
mask = range(int(data.shape[0] * 0.8), data.shape[0])
data_val = data[mask]
mask = range(int(data.shape[0] * 0.8))
data_train = data[mask]

train_dset = (tf.data.Dataset.from_tensor_slices(data_train)
              .shuffle(buffer_size=10000)
              .repeat(1)
              .batch(BATCH_SIZE))
val_dset = (tf.data.Dataset.from_tensor_slices(data_val)
            .batch(BATCH_SIZE))


loss_metric = tf.keras.metrics.Mean(name='train_loss')
optimizer = tf.keras.optimizers.Adam(0.001)

@tf.function
def train_step(inputs):
    ...

for epoch in range(NUM_EPOCHS):
    # Reset the metrics
    loss_metric.reset_states()
    for inputs in train_dset:
        train_step(inputs)
    ...


It is the batches that need to be reshuffled. Apply shuffle() after batch(), so whole batches are shuffled; since reshuffle_each_iteration defaults to True, the order is reshuffled on every pass over the dataset, i.e. at each epoch:

train_dset = (tf.data.Dataset.from_tensor_slices(data_train)
              .repeat(1)
              .batch(BATCH_SIZE))

# buffer_size should be at least the number of batches for a full shuffle
train_dset = train_dset.shuffle(buffer_size=buffer_size)
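
A minimal self-contained sketch of this approach (the toy array and BATCH_SIZE below are stand-ins, since sample_data() is elided above); printing two epochs shows the batch order changing between passes:

import numpy as np
import tensorflow as tf

data_train = np.arange(12)  # toy stand-in for the real training data
BATCH_SIZE = 4

train_dset = (tf.data.Dataset.from_tensor_slices(data_train)
              .repeat(1)
              .batch(BATCH_SIZE))

# Shuffling after batching shuffles whole batches; reshuffle_each_iteration
# defaults to True, so the batch order changes on every iteration (epoch).
train_dset = train_dset.shuffle(buffer_size=3)  # 3 batches of 4 elements

for epoch in range(2):
    print("epoch", epoch, [batch.numpy().tolist() for batch in train_dset])

Note that shuffling after batch() keeps each batch's contents fixed and only permutes the batch order; to reshuffle individual examples each epoch, call shuffle() before batch() instead.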