Python AutoKeras image classifier: generator not working, plt.show() gives an empty image


I am trying to build an image classification program using AutoKeras, TensorFlow, and pandas.

The code is as follows:

from keras_preprocessing.image import ImageDataGenerator
import autokeras as ak
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf

# directory with subfolders (that contain other subfolders) that contain images
data_dir = "/home/jack/project/"

# dataframe initialization
dataframe = pd.read_excel("/home/jack/project/pathsandlabels.xlsx")

# splitting the dataset
train_dataframe = dataframe.sample(frac=0.75, random_state=200)
test_dataframe = dataframe.drop(train_dataframe.index)

# Augmenting it
datagen = ImageDataGenerator(rescale=1./255., horizontal_flip=True, shear_range=0.6, zoom_range=0.4,
                         validation_split=0.25)

# Setting up a train generator
train_generator = datagen.flow_from_dataframe(
    dataframe=train_dataframe,
    directory="/home/jack/project",
    x_col="filename",
    y_col="assessment",
    subset="training",
    seed=42,
    batch_size=16,
    shuffle=True,
    class_mode="binary",
    target_size=(224, 224)
)


# setting up a validation generator
validation_generator = datagen.flow_from_dataframe(
    dataframe=train_dataframe,
    directory="/home/jack/project/",
    x_col="filename",
    y_col="assessment",
    subset="validation",
    batch_size=16,
    seed=42,
    shuffle=True,
    class_mode="binary",
    target_size=(224, 224)
)

# Another augmentation but for test data
test_gen = ImageDataGenerator(rescale=1./255.)

# test generator set up
test_generator = test_gen.flow_from_dataframe(
    dataframe=test_dataframe,
    directory="/home/jack/project/",
    x_col="filename",
    y_col=None,
    batch_size=16,
    seed=42,
    shuffle=False,
    class_mode=None,
    target_size=(224, 224)
)


# this function will yield the variables we need to work with in order to create a train and test set
# it will iterate through the generator
def my_iterator(generator):
    for img_batch, targets_batch in generator:
        yield test_generator.batch_size, targets_batch


# Train and Validation set creation
# The first problem is here
# 1: Invalid argument: Value Error: 'generator' yielded an element of shape (16,224,224,3) where an element
# of shape (224,) was expected.
train_set = tf.data.Dataset.from_generator(lambda: my_iterator(train_generator), output_shapes=(224, 244),
                                       output_types=(tf.float32, tf.float32))

val_set = tf.data.Dataset.from_generator(lambda: my_iterator(validation_generator), output_shapes=(224, 224),
                                     output_types=(tf.float32, tf.float32))

# we check the output of both validation and train sets
print(train_set)
print(val_set)

# This piece of code is where the other two issues are:
# 2: squeeze(axis=2) gives this error: ValueError: cannot select an axis to squeeze out which has size not equal to one 
# 3: Issue 2 can be averted by setting axis=None, but the next problem is plt.show() gives an empty image. 
for image, label in train_set.take(1):
    print("Image shape: ", image.numpy().shape)
    print("Label: ", label.numpy().shape)
    plt.imshow(image.numpy()[0].squeeze(axis=2) * 255)
    plt.show()

clf = ak.ImageClassifier(overwrite=True, max_trials=1, seed=5)
clf.fit(x=train_set, epochs=20)
print(clf.evaluate(val_set))
I have noted the problems I am facing as comments in the code, but I will explain them again.

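The biggest problem is the first one: ValueError: 'generator' yielded an element of shape (16, 224, 224, 3) where an element of shape (224,) was expected. It occurs when I try to initialize the datasets with from_generator.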
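A note on where that shape mismatch may come from: with two output types, `output_shapes=(224, 224)` appears to be read as one shape per yielded element, i.e. shape (224,) for the images and shape (224,) for the labels, which matches the error text. Note also that `my_iterator` as posted yields `test_generator.batch_size` in place of the image batch. The sketch below shows one way to describe batched elements with `output_signature` (available in TensorFlow 2.4 and later). `FakeKerasIterator`, `one_epoch`, and `signature` are hypothetical names used only for illustration, and the fake iterator stands in for `train_generator` so the example runs without the image files.

```python
import numpy as np
import tensorflow as tf


class FakeKerasIterator:
    """Hypothetical stand-in for train_generator: endless (images, labels) batches."""

    batch_size = 16

    def __len__(self):
        return 3  # number of batches in one epoch

    def __iter__(self):
        return self

    def __next__(self):
        images = np.random.rand(self.batch_size, 224, 224, 3).astype("float32")
        labels = np.random.randint(0, 2, size=self.batch_size).astype("float32")
        return images, labels


def one_epoch(generator):
    """Yield exactly len(generator) batches, since Keras iterators never stop."""
    for _, (img_batch, targets_batch) in zip(range(len(generator)), generator):
        yield img_batch, targets_batch


# One spec per yielded element; None leaves room for a smaller final batch.
signature = (
    tf.TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32),
    tf.TensorSpec(shape=(None,), dtype=tf.float32),
)

train_generator = FakeKerasIterator()
train_set = tf.data.Dataset.from_generator(
    lambda: one_epoch(train_generator), output_signature=signature
)

for images, labels in train_set.take(1):
    print(images.shape, labels.shape)
```

With the older `output_types`/`output_shapes` arguments the shapes would need the same nesting, for example `output_shapes=((None, 224, 224, 3), (None,))`. A flat tuple such as `(224, 224, 3)` has three entries against two output types, which is likely what the "two sequences are not the same length" message refers to. Whether AutoKeras expects batched or per-image elements may depend on its version; `train_set.unbatch()` gives per-image elements if needed.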

What I tried:

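  • Changing output_shapes to (224, 224, 3) and (16, 224, 224, 3) (did not help; it raised a different error saying "the two sequences are not the same length")
  • Removing batch_size from train_generator (this resets it to the default of 32, which my computer cannot handle)
  • Changing target_size in the generators to (224, 224, 3) and (16, 224, 224, 3). No effect
  • Changing the number of variables that my_iterator yields. Did not work (error message: expected n (3 or 4) values to unpack, got 2)
  • Changing the batch size to a number that divides the total number of images evenly (did not work; it threw the original error message)

How the data is stored: an Excel file with a single sheet and two columns, A and B. The column names are filename and assessment. filename is the path to the image (e.g. "/subfolder/subfolder/subfolder/subfolder/A2c3jc3291n.jpeg"), without the quotes, obviously. assessment is the class; in this case there are only two.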
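For the second and third issues noted in the code comments, here is a small NumPy sketch of what is probably going on. `squeeze(axis=2)` can only drop an axis of length 1, and for a (224, 224, 3) image axis 2 is the channel axis of length 3. The empty plot is likely a value-range problem: as far as I know, `plt.imshow` expects float RGB data in [0, 1] and clips anything above that, so multiplying an already rescaled float image by 255 turns almost every pixel white. The random batch below is an assumption standing in for one batch from the generator.

```python
import numpy as np

# Stand-in for one generator batch: 16 RGB images already rescaled to [0, 1].
rng = np.random.default_rng(0)
batch = rng.random((16, 224, 224, 3), dtype=np.float32)

first = batch[0]  # shape (224, 224, 3)

# Issue 2: axis 2 has length 3, so it cannot be squeezed out.
try:
    first.squeeze(axis=2)
    squeeze_failed = False
except ValueError:
    squeeze_failed = True

# squeeze() without an axis finds nothing to remove and leaves the shape alone.
unchanged = first.squeeze()

# Issue 3: after * 255 the data is still float, and nearly all of it is > 1.
scaled = first * 255
fraction_above_one = float((scaled > 1.0).mean())

# Two ways to get a displayable image: keep the floats in [0, 1] as they are,
# or convert explicitly to 8-bit integers in [0, 255].
as_uint8 = (first * 255).astype(np.uint8)

print(squeeze_failed, unchanged.shape, round(fraction_above_one, 3), as_uint8.dtype)
```

So `plt.imshow(first)` or `plt.imshow(as_uint8)` should both show the picture, with no squeeze needed for a 3-channel image.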

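This is an extremely complicated way of using a directory iterator. You can do it in 4-5 lines of code. With small edits, you can make it take the filenames from the dataframe.

OK, I'll give it a try. Thanks.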
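The shorter approach the comment refers to is not shown here, so the following is only one possible sketch of a simpler route, not necessarily what the commenter had in mind: build a `tf.data.Dataset` straight from the dataframe's file paths and labels, skipping `ImageDataGenerator` and `from_generator`. It assumes the `assessment` column is already encoded as 0/1 and that the images are JPEGs. `load_image` and `dataset_from_dataframe` are hypothetical helper names, and the demo writes two synthetic images to a temporary folder so it runs without the real data.

```python
import os
import tempfile

import numpy as np
import pandas as pd
import tensorflow as tf


def load_image(path, label):
    """Read one JPEG, resize it to 224x224 and scale pixel values to [0, 1]."""
    raw = tf.io.read_file(path)
    image = tf.io.decode_jpeg(raw, channels=3)
    image = tf.image.resize(image, (224, 224)) / 255.0
    return image, label


def dataset_from_dataframe(frame, root, batch_size=16):
    """Build a batched dataset from the 'filename' and 'assessment' columns."""
    # Assumption: filenames are paths relative to root with a leading slash.
    paths = (root.rstrip("/") + "/" + frame["filename"].str.lstrip("/")).tolist()
    # Assumption: labels are numeric 0/1; string labels would need encoding first.
    labels = frame["assessment"].astype("float32").tolist()
    ds = tf.data.Dataset.from_tensor_slices((paths, labels))
    return ds.map(load_image, num_parallel_calls=tf.data.AUTOTUNE).batch(batch_size)


# Demo data: two random JPEGs in a temporary directory.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "sub"))
for name in ("a.jpeg", "b.jpeg"):
    pixels = np.random.randint(0, 256, size=(300, 200, 3), dtype=np.uint8)
    tf.io.write_file(os.path.join(root, "sub", name), tf.io.encode_jpeg(pixels))

frame = pd.DataFrame({"filename": ["/sub/a.jpeg", "/sub/b.jpeg"], "assessment": [0, 1]})
dataset = dataset_from_dataframe(frame, root, batch_size=2)

images, labels = next(iter(dataset))
print(images.shape, labels.numpy())
```

With the real data, `frame` would be `train_dataframe` and `root` would be `data_dir`. Augmentation (flips, zoom, shear) is left out of this sketch and would have to be added separately.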