How does TensorFlow know which part of the data to assign to which subset?

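The code snippet below was copied from a tutorial on the TensorFlow website. There are two code blocks, one for train_ds and one for val_ds. They are identical except for the subset= argument. I would like to know whether TensorFlow assigns the first 80% of the data to train_ds and the rest to val_ds. If not, how does TensorFlow know which part goes to which subset? Thanks.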

train_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,
  subset="training",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
  data_dir,
  validation_split=0.2,  # same value as in the train_ds call
  subset="validation",
  seed=123,
  image_size=(img_height, img_width),
  batch_size=batch_size
)

Let's take a look at the source code. It basically boils down to this function:

def get_training_or_validation_split(samples, labels, validation_split, subset):
  """Potentially restrict samples & labels to a training or validation split.
  Args:
    samples: List of elements.
    labels: List of corresponding labels.
    validation_split: Float, fraction of data to reserve for validation.
    subset: Subset of the data to return.
      Either "training", "validation", or None. If None, we return all of the
      data.
  Returns:
    tuple (samples, labels), potentially restricted to the specified subset.
  """
  if not validation_split:
    return samples, labels

  num_val_samples = int(validation_split * len(samples))
  if subset == 'training':
    print('Using %d files for training.' % (len(samples) - num_val_samples,))
    samples = samples[:-num_val_samples]
    labels = labels[:-num_val_samples]
  elif subset == 'validation':
    print('Using %d files for validation.' % (num_val_samples,))
    samples = samples[-num_val_samples:]
    labels = labels[-num_val_samples:]
  else:
    raise ValueError('`subset` must be either "training" '
                     'or "validation", received: %s' % (subset,))
  return samples, labels
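
In other words, the training set takes the first part of the samples and the validation set takes the last part. With validation_split=0.2, the final 20% of the file list goes to validation and everything before it goes to training.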
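
To see why both calls pass the same seed, here is a minimal sketch of the slicing logic on a toy list of file names. It assumes, as I understand the default shuffle=True behavior, that the file list is shuffled with the given seed before it is sliced; the helper names split_samples and load_subset and the img_*.jpg file names are made up for illustration and are not part of the Keras API.

```python
import random


def split_samples(samples, validation_split, subset):
    """Toy version of the slicing above: head for training, tail for validation."""
    if not validation_split:
        return samples
    num_val = int(validation_split * len(samples))
    cut = len(samples) - num_val  # index where the validation tail starts
    if subset == "training":
        return samples[:cut]
    if subset == "validation":
        return samples[cut:]
    raise ValueError('subset must be "training" or "validation", got %r' % (subset,))


def load_subset(paths, validation_split, subset, seed):
    """Hypothetical stand-in for one dataset call: seeded shuffle, then slice."""
    ordered = list(paths)
    # Assumption: the same seed yields the same order in both calls.
    random.Random(seed).shuffle(ordered)
    return split_samples(ordered, validation_split, subset)


file_paths = ["img_%d.jpg" % i for i in range(10)]
train = load_subset(file_paths, 0.2, "training", seed=123)
val = load_subset(file_paths, 0.2, "validation", seed=123)

print(len(train), len(val))
print(set(train).isdisjoint(val))
```

Because both calls shuffle with the same seed, they slice the same ordering, so the two subsets do not overlap and together cover every file. With different seeds, the head of one ordering and the tail of another could share files.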