How does the model know how my data is labeled?

python tensorflow deep-learning

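I have the code below, and I would like to know how the model knows the labels of the images.

I can't find any labeling function in the code, and the dog-cat data directories contain nothing but a large number of images. I want to understand this so that I can use the model with a different dataset; I just don't know how to label that data.

Edit: A better way to phrase it: how does this script know how my images are labeled, when the test, training and validation directories are each in random order and nothing in the file names indicates the label?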

#Importing a pre-trained network
#Running from end to end with fine-tuning
from keras.applications import ResNet50
from keras import models
from keras import layers
from keras import optimizers
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
import tensorflow as tf
#Fix memory growth issue
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)




#Summoning pretrained network
#Top is the classifier (Dense layer) which we want to change
#Using imagenet weights
conv_base = ResNet50(weights= 'imagenet',
                 include_top= False, 
                 input_shape= (150, 150, 3))


#Directories
test_dir = r''
train_dir = r'C:\Users\17574\Downloads\dogs-vs-cats\Training'
valid_dir = r'C:\Users\17574\Downloads\dogs-vs-cats\Validation'



#Network
model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))


#Fine tuning
conv_base.trainable = True
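#Go through the layers in order: freeze everything before 'block5_conv1',
#then unfreeze that layer and every layer after it.
#NOTE: 'block5_conv1' is a VGG16 layer name. ResNet50 has no layer with this
#name, so as written every layer stays frozen. Check conv_base.summary() for
#the actual layer names.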
set_trainable = False
for layer in conv_base.layers:
    if layer.name == 'block5_conv1':
        set_trainable = True
    if set_trainable:
        layer.trainable = True
    else:
        layer.trainable = False





#Data augmentation
train_datagen = ImageDataGenerator(
      rescale=1./255,
      rotation_range=40,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True,
      fill_mode='nearest')


# Note that the test data should not be augmented!
test_datagen = ImageDataGenerator(rescale=1./255)



#Generator that takes images and runs it through the data augmentation
train_generator = train_datagen.flow_from_directory(
        # This is the target directory
        train_dir,
        # All images will be resized to 150x150
        target_size=(150, 150),
        batch_size=20,
        # Since we use binary_crossentropy loss, we need binary labels
        class_mode='binary')


#Validation data comes from valid_dir (the original passed test_dir here)
validation_generator = test_datagen.flow_from_directory(
        valid_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')


#Use RMS prop to avoid doing big changes to weights
#We do not want to harm already trained weights too much
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(lr=1e-5),
              metrics=['acc'])

history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=100,
      validation_data=validation_generator,
      validation_steps=50)



history = model.fit_generator(
      train_generator,
      steps_per_epoch=100,
      epochs=30,
      validation_data=validation_generator,
      validation_steps=50,
      verbose=2)


#model.save('cats_and_dogs_small_4.h5')


acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

epochs = range(len(acc))

plt.plot(epochs, acc, 'bo', label='Training acc')
plt.plot(epochs, val_acc, 'b', label='Validation acc')
plt.title('Training and validation accuracy')
plt.legend()

plt.figure()

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()

plt.show()



test_generator = test_datagen.flow_from_directory(
        test_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')

test_loss, test_acc = model.evaluate_generator(test_generator, steps=50)
print('test acc:', test_acc)

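From the documentation:

Arguments

labels: Either "inferred" (labels are generated from the directory structure), or a list/tuple of integer labels of the same size as the number of image files found in the directory. Labels should be sorted according to the alphanumeric order of the image file paths (obtained via os.walk(directory) in Python).

In short, if the directory structure is: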

main_directory/
...class_a/
......a_image_1.jpg
......a_image_2.jpg
...class_b/
......b_image_1.jpg
......b_image_2.jpg

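...then calling image_dataset_from_directory(main_directory, labels='inferred') will return a dataset of images from the subdirectories class_a and class_b, together with the labels 0 and 1 (0 corresponding to class_a and 1 corresponding to class_b).

So the labels are inferred from the folders. You can, of course, also specify which folders to include.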
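To make the idea of "inferred" labels concrete, here is a minimal pure-Python sketch of the same principle. It is not the Keras implementation: `infer_labels` and `build_demo_tree` are hypothetical helpers, and the sketch only assumes that each immediate subfolder is one class and that classes are numbered in sorted order of the folder names.

```python
import os
import tempfile


def infer_labels(directory):
    """Hypothetical helper: derive integer labels from subfolder names."""
    # Every immediate subdirectory is treated as one class.
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    # The position in the sorted list becomes the integer label.
    class_indices = {name: index for index, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        class_dir = os.path.join(directory, name)
        for filename in sorted(os.listdir(class_dir)):
            samples.append((filename, class_indices[name]))
    return class_indices, samples


def build_demo_tree(root):
    """Create the main_directory layout from the answer, using empty files."""
    layout = {
        "class_b": ["b_image_1.jpg", "b_image_2.jpg"],
        "class_a": ["a_image_1.jpg", "a_image_2.jpg"],
    }
    for class_name, filenames in layout.items():
        os.makedirs(os.path.join(root, class_name))
        for filename in filenames:
            open(os.path.join(root, class_name, filename), "w").close()


with tempfile.TemporaryDirectory() as main_directory:
    build_demo_tree(main_directory)
    class_indices, samples = infer_labels(main_directory)

print(class_indices)
for filename, label in samples:
    print(label, filename)
```

The file names play no role here; only the folder an image sits in decides its label, which is why the images themselves need no label in their names.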
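Let's take an example:

Suppose you downloaded the dogVcat dataset from Kaggle. The directories would then be:

For the training set

Downloads/dataset/training_set/cats: this is labeled 0, because it is the first folder in the directory (subfolders are taken in alphanumeric order).
Downloads/dataset/training_set/dogs: this is labeled 1, because it is the second folder in the directory.

For the validation set

Downloads/dataset/validation_set/cats: this is labeled 0.
Downloads/dataset/validation_set/dogs: this is labeled 1.

All of this is done by the flow_from_directory() function of the ImageDataGenerator class.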
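Since the folder an image sits in is its label, reusing the script for another dataset mostly means arranging the files into one subfolder per class. The following is a rough sketch of that step. The helper `sort_into_class_folders`, the file names and the class names `birds` and `fish` are all made up for illustration, and the `labels` mapping stands in for whatever label source the new dataset comes with (a CSV file, for instance).

```python
import os
import shutil
import tempfile


def sort_into_class_folders(source_dir, target_dir, labels):
    """Copy each file from source_dir into target_dir/<class name>/.

    labels is a hypothetical mapping of file name to class name.
    """
    for filename, class_name in labels.items():
        class_dir = os.path.join(target_dir, class_name)
        os.makedirs(class_dir, exist_ok=True)
        shutil.copy2(
            os.path.join(source_dir, filename),
            os.path.join(class_dir, filename),
        )
    # The resulting subfolder names, in the order used for labeling.
    return sorted(os.listdir(target_dir))


with tempfile.TemporaryDirectory() as root:
    flat_dir = os.path.join(root, "flat")
    training_dir = os.path.join(root, "Training")
    os.makedirs(flat_dir)

    # Made-up labels for a flat folder of (empty) demo images.
    labels = {"img_001.jpg": "fish", "img_002.jpg": "birds", "img_003.jpg": "fish"}
    for filename in labels:
        open(os.path.join(flat_dir, filename), "w").close()

    created_classes = sort_into_class_folders(flat_dir, training_dir, labels)
    fish_files = sorted(os.listdir(os.path.join(training_dir, "fish")))

print(created_classes)
print(fish_files)
```

After such a step, train_dir in the question's script could point at the new Training folder. With more than two classes, class_mode='binary' and the single sigmoid output would no longer fit; class_mode='categorical' with a softmax output layer is the usual counterpart.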
It assigns the classes based on the folder names.
OK, so you mean TensorFlow knows that the first folder is labeled 0 and the second folder is labeled 1, right? Looking at the documentation (sorry, I should have done that first), I can also shuffle my data. I'll give it a try.
Yes, and there is even a way to retrieve that order.
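The last comment presumably refers to the class_indices attribute of the iterator returned by flow_from_directory(), a dict that maps folder names to label indices. Assuming it looks like {'cats': 0, 'dogs': 1} for the dataset above, turning the model's sigmoid outputs back into folder names could be sketched as follows; `decode_binary_predictions` is a hypothetical helper, not part of Keras.

```python
def decode_binary_predictions(probabilities, class_indices, threshold=0.5):
    """Map sigmoid outputs back to class (folder) names.

    class_indices: dict of class name to label index, in the form
    exposed by train_generator.class_indices.
    """
    # Invert the mapping so a label index can be looked up by number.
    index_to_class = {index: name for name, index in class_indices.items()}
    # A sigmoid output is the predicted probability of the class labeled 1.
    return [index_to_class[int(p > threshold)] for p in probabilities]


# Assumed mapping; in the question's script it would be read from
# train_generator.class_indices instead of being written by hand.
class_indices = {"cats": 0, "dogs": 1}
predicted = decode_binary_predictions([0.08, 0.93, 0.51], class_indices)
print(predicted)
```

Printing train_generator.class_indices right after creating the generator is a quick way to confirm which folder received which label before training starts.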