Python: using from_generator() with tf.data.Dataset

I have this augmentation code:

import tensorflow as tf

class CustomAugment(object):
   def __call__(self, sample):        
       sample = self._random_apply(tf.image.flip_left_right, sample, p=0.5)
       sample = self._random_apply(self._color_jitter, sample, p=0.8)
       sample = self._random_apply(self._color_drop, sample, p=0.2)

       return sample

   def _color_jitter(self, x, s=1):
       
       x = tf.image.random_brightness(x, max_delta=0.8*s)
       x = tf.image.random_contrast(x, lower=1-0.8*s, upper=1+0.8*s)
       x = tf.image.random_saturation(x, lower=1-0.8*s, upper=1+0.8*s)
       x = tf.image.random_hue(x, max_delta=0.2*s)
       x = tf.clip_by_value(x, 0, 1)
       return x
   
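   def _color_drop(self, x):
       # Convert to grayscale, then repeat the single channel 3 times.
       # The 4-element multiples assume a batched [N, H, W, C] input.
       x = tf.image.rgb_to_grayscale(x)
       x = tf.tile(x, [1, 1, 1, 3])
       return x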
   
   def _random_apply(self, func, x, p):
       return tf.cond(
         tf.less(tf.random.uniform([], minval=0, maxval=1, dtype=tf.float32),
                 tf.cast(p, tf.float32)),
         lambda: func(x),
         lambda: x)
This is how I import my image dataset:

train_ds = tf.data.Dataset.from_generator(path)


I want to apply this augmentation to my train_ds. How should I proceed?

First, create a custom generator by subclassing tf.keras.utils.Sequence, then implement the __getitem__ and __len__ methods.

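import numpy as np
import tensorflow as tf
from PIL import Image

class CustomDataGen(tf.keras.utils.Sequence):

    def __init__(self, df, X_col, y_col,
                 batch_size,
                 input_size,
                 base_path='',
                 shuffle=True):
        super().__init__()
    
        self.df = df.copy()
        self.X_col = X_col
        self.y_col = y_col
        self.batch_size = batch_size
        self.input_size = input_size  # (width, height, channels)
        self.base_path = base_path    # prefix prepended to each image path
        self.shuffle = shuffle
    
        self.n = len(self.df)
        self.n_name = df[y_col['label']].nunique()  # number of classes
    
    def on_epoch_end(self):
        # Reshuffle the rows between epochs when shuffle=True
        if self.shuffle:
            self.df = self.df.sample(frac=1).reset_index(drop=True)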
    
    def __getitem__(self, index):
        batches = self.df[index * self.batch_size:(index + 1) * 
                          self.batch_size]
        X, y = self.__get_data(batches)        
        return X, y
    
    def __len__(self):
        return self.n // self.batch_size
    
    def __get_output(self, label, num_classes):
        return tf.keras.utils.to_categorical(label, 
                                             num_classes=num_classes)    
    
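    def __get_input(self, path, target_size):
        # Load the image with PIL, force 3 channels, resize to (width, height)
        img = Image.open(self.base_path + path).convert('RGB')
        img = img.resize(target_size[:2])
        # Scale to [0, 1] first: the augmentation clips its output to that range
        img = np.asarray(img, dtype=np.float32) / 255.0
        
        # Your augmentation. CustomAugment must be instantiated, then called.
        # It expects a batched [N, H, W, C] tensor, so add and drop a batch axis.
        img = CustomAugment()(tf.expand_dims(img, 0))
        return tf.squeeze(img, 0).numpy()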


    def __get_data(self, batches):
        # Generates data containing batch_size samples

        img_path_batch = batches[self.X_col['img']]
        label_batch = batches[self.y_col['label']]

        X_batch = np.asarray([self.__get_input(x, self.input_size)
                              for x in img_path_batch])
        # __get_output needs the number of classes for one-hot encoding
        y_batch = np.asarray([self.__get_output(y, self.n_name)
                              for y in label_batch])

        return X_batch, y_batch
As you can see, the samples are augmented in the __get_input method.
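The indexing contract behind __getitem__ and __len__ can be seen without TensorFlow at all. The following is only a minimal sketch: ToyBatches is a hypothetical stand-in, not part of Keras, and it mirrors the slicing used in the class above. Note that floor division in __len__ silently drops a final partial batch.

```python
import math


class ToyBatches:
    """Hypothetical stand-in that shows only the batch indexing contract."""

    def __init__(self, items, batch_size):
        self.items = list(items)
        self.batch_size = batch_size

    def __len__(self):
        # Floor division: a trailing partial batch is not counted.
        return len(self.items) // self.batch_size

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        return self.items[start:start + self.batch_size]


batches = ToyBatches(range(10), batch_size=4)
all_batches = [batches[i] for i in range(len(batches))]

# If the partial batch should be kept, the batch count would use ceil instead.
n_with_partial = math.ceil(10 / 4)
```

With 10 items and a batch size of 4, items 8 and 9 never appear in any batch. If that matters for your data, using the ceil-based count in __len__ is one option.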

To use this class:

traingen = CustomDataGen(df, base_path=IMGS_DIR,
                         X_col={'img': 'img'},
                         y_col={'label': 'label'},
                         batch_size=16,
                         input_size=IMAGE_SIZE)
Note: if you need to use the generator with tf.data, do it as follows:

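train_dataset = tf.data.Dataset.from_generator(lambda: traingen,
                                               # to_categorical returns float32 one-hot labels by default
                                               output_types = (tf.float32, tf.float32),
                                               # image arrays are laid out as (height, width, channels)
                                               output_shapes = ([None, height, width, channels], [None, num_classes]))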
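Once the batches are in a tf.data.Dataset, another option is to apply augmentation with Dataset.map instead of inside the generator. The sketch below makes several assumptions: batch_generator is a hypothetical stand-in for traingen that yields random arrays, the 32x32 image size and 2 classes are made up, and a plain random flip stands in for the full augmentation. It uses output_signature, which newer TensorFlow 2 releases recommend over output_types and output_shapes.

```python
import numpy as np
import tensorflow as tf


def batch_generator():
    # Hypothetical stand-in for traingen: yields (images, one-hot labels).
    rng = np.random.default_rng(0)
    for _ in range(3):
        images = rng.random((4, 32, 32, 3), dtype=np.float32)  # values in [0, 1)
        labels = np.eye(2, dtype=np.float32)[[0, 1, 0, 1]]
        yield images, labels


ds = tf.data.Dataset.from_generator(
    batch_generator,
    output_signature=(
        tf.TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32),
        tf.TensorSpec(shape=(None, 2), dtype=tf.float32),
    ),
)


def augment(images, labels):
    # Simple placeholder augmentation; labels pass through unchanged.
    return tf.image.random_flip_left_right(images), labels


ds = ds.map(augment, num_parallel_calls=tf.data.AUTOTUNE).prefetch(tf.data.AUTOTUNE)

images, labels = next(iter(ds))
```

Because the CustomAugment class from the question is built only from TensorFlow ops, something like ds.map(lambda x, y: (CustomAugment()(x), y)) may work in the same position, provided the images are batched floats in [0, 1]. That variant has not been verified here.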

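NameError: name 'df' is not defined. Why do I get this error?
You should create a Pandas DataFrame with two columns, 'img' and 'label': 'img' holds the image file name and 'label' holds the class label.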
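As a rough illustration, such a DataFrame could be built like this. The file names and class ids are hypothetical placeholders. The labels are assumed to be integers from 0 to num_classes - 1, which is what tf.keras.utils.to_categorical expects.

```python
import pandas as pd

# Hypothetical file names and integer class ids; replace with your own data.
records = [
    {"img": "cat_001.jpg", "label": 0},
    {"img": "dog_001.jpg", "label": 1},
    {"img": "cat_002.jpg", "label": 0},
]

df = pd.DataFrame(records, columns=["img", "label"])

# The generator derives the number of classes the same way.
num_classes = df["label"].nunique()
```

The column names match the X_col={'img': 'img'} and y_col={'label': 'label'} arguments passed to the generator.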