TensorFlow: shape mismatch with Keras VGG16: expected ndim=4, found ndim=2, received shape [None, None]


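While learning Keras and deep learning, I want to build an image matting algorithm with an architecture similar to a modified autoencoder: it takes two image inputs (a source image and a user-generated trimap) and produces one image output (the alpha values of the image foreground). The encoder part (for both inputs) uses a pretrained VGG16 for simple feature extraction. I want to train the decoder on the low-resolution alphamatting.com dataset.

Running the attached code produces the error:
ValueError: Input 0 of layer block1_conv1 is incompatible with the layer: expected ndim=4, found ndim=2. Full shape received: [None, None]

I am having trouble understanding this error. I verified that my twin_gen closure generates image batches of shape (22, 256, 256, 3) for both inputs, so I guess I have somehow built the model incorrectly, but I cannot see where. Can someone help explain why I am seeing this error?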

import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2DTranspose, Concatenate, BatchNormalization, Input
from tensorflow.keras.preprocessing.image import ImageDataGenerator


def DeConvBlock(input, num_output):
    x = Conv2DTranspose(num_output, kernel_size=3, strides=2, activation='relu', padding='same')(input)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)
    return x


img_input = Input((256, 256, 3))
img_vgg16 = VGG16(include_top=False, weights='imagenet')
img_vgg16._name = 'img_vgg16'
img_vgg16.trainable = False


tm_input = Input((256, 256, 3))
tm_vgg16 = VGG16(include_top=False, weights='imagenet')
tm_vgg16._name = 'tm_vgg16'
tm_vgg16.trainable = False

img_vgg16 = img_vgg16(img_input)
tm_vgg16 = tm_vgg16(tm_input)
x = Concatenate()([img_vgg16, tm_vgg16])
x = DeConvBlock(x, 512)
x = DeConvBlock(x, 256)
x = DeConvBlock(x, 128)
x = DeConvBlock(x, 64)
x = DeConvBlock(x, 32)
x = Conv2DTranspose(1, kernel_size=3, strides=1, activation='sigmoid', padding='same')(x)


m = Model(inputs=[img_input, tm_input], outputs=x)
m.summary()
m.compile(optimizer='adam', loss='mean_squared_error')

gen = ImageDataGenerator(width_shift_range=0.1, rotation_range=30, height_shift_range=0.1, horizontal_flip=True, validation_split=0.2, preprocessing_function=preprocess_input)
SEED = 49


def twin_gen(generator, subset):
    gen_img = generator.flow_from_directory('./data', classes=['input_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
    gen_map = generator.flow_from_directory('./data/trimap_training_lowres', classes=['Trimap1'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
    gen_truth = generator.flow_from_directory('./data', classes=['gt_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')

    while True:
        img = gen_img.__next__()
        tm = gen_map.__next__()
        gt = gen_truth.__next__()
        yield [[img, tm], gt]


train_gen = twin_gen(gen, 'training')
val_gen = twin_gen(gen, 'validation')


checkpoint_filepath = 'checkpoint'
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_loss',
    mode='auto',
    save_freq='epoch',
    save_best_only=True)


r = m.fit(train_gen, validation_data=val_gen, epochs=10, callbacks=[checkpoint])

First, you did not specify an input shape for VGG16 and you set include_top=False, so in the channels_last case the default input shape is (None, None, 3).

PS: see the source code of keras.applications.VGG16 and keras.applications.imagenet_utils.obtain_input_shape for details.
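
The effect of omitting input_shape can be checked directly on the VGG16 model itself. The following is a minimal sketch, assuming TensorFlow 2.x is installed; weights=None is used here only to skip the ImageNet download (the shapes do not depend on the weights), and the variable names are illustrative rather than taken from the question.

```python
from tensorflow.keras.applications.vgg16 import VGG16

# weights=None skips the ImageNet download; only the shapes matter here.
vgg_unspecified = VGG16(include_top=False, weights=None)
vgg_fixed = VGG16(include_top=False, weights=None, input_shape=(256, 256, 3))

# Convert to plain tuples so the result is comparable across Keras versions.
shape_unspecified = tuple(vgg_unspecified.output.shape)
shape_fixed = tuple(vgg_fixed.output.shape)

print(shape_unspecified)  # spatial dimensions are unknown
print(shape_fixed)        # spatial dimensions are fully defined
```

Without input_shape the height and width stay undefined, while the fixed variant reports a concrete feature-map size.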

By calling model.summary(), you can see the None output shape.

To fix this, simply set input_shape=(256, 256, 3) in VGG16; calling model.summary() then gives:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            [(None, 256, 256, 3) 0
__________________________________________________________________________________________________
input_3 (InputLayer)            [(None, 256, 256, 3) 0
__________________________________________________________________________________________________
img_vgg16 (Functional)          (None, 8, 8, 512)    14714688    input_1[0][0]
__________________________________________________________________________________________________
tm_vgg16 (Functional)           (None, 8, 8, 512)    14714688    input_3[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 8, 8, 1024)   0           img_vgg16[0][0]
                                                                 tm_vgg16[0][0]
__________________________________________________________________________________________________
            
The main cause of the error is that when you call __next__(), it returns a tuple of two arrays (data, labels) with shapes ((batch_size, 256, 256, 3), (batch_size, 1)), but we only need the first array.

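In addition, the data generator should yield a tuple rather than a list; otherwise no gradients will be provided for any variable, because the fit function expects (inputs, targets) as the return value of the data generator.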
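
The two generator fixes can be tried without any image files. This is a minimal sketch assuming NumPy is available; fake_directory_iterator is a hypothetical stand-in for what flow_from_directory returns, producing zero-filled (data, labels) pairs with the shapes described above, and this twin_gen takes the three iterators as arguments purely for illustration.

```python
import numpy as np


def fake_directory_iterator(channels, batch_size=4, size=256):
    """Hypothetical stand-in for flow_from_directory: yields (data, labels)."""
    while True:
        data = np.zeros((batch_size, size, size, channels), dtype=np.float32)
        labels = np.zeros((batch_size, 1), dtype=np.float32)
        yield data, labels


def twin_gen(gen_img, gen_map, gen_truth):
    """Yield ([img, tm], gt) as a tuple of (inputs, targets)."""
    while True:
        img = next(gen_img)[0]   # keep the image batch, drop the labels
        tm = next(gen_map)[0]
        gt = next(gen_truth)[0]
        yield ([img, tm], gt)    # a tuple, not a list


batch = next(twin_gen(fake_directory_iterator(3),
                      fake_directory_iterator(3),
                      fake_directory_iterator(1)))
inputs, targets = batch
print(type(batch).__name__, [x.shape for x in inputs], targets.shape)
```

Indexing with [0] discards the class labels, and the outer tuple separates the two model inputs from the single target batch.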

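You also have another problem: when you load the gen_truth images with color_mode='rgb', the model output shape is (batch_size, 256, 256, 1), while the gen_truth element shape is (batch_size, 256, 256, 3). To get the same shape as the model output, load gen_truth with color_mode='grayscale' if you have grayscale images; if you want to use the alpha values, load it with color_mode='rgba' and take the last channel (I am only guessing from the description in your question, but you should know which applies).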
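
For the RGBA option, taking the last channel while keeping a channel axis can be done with a slice. A minimal sketch, assuming NumPy; rgba_batch is a hypothetical random array standing in for a batch loaded with color_mode='rgba'.

```python
import numpy as np

# Hypothetical RGBA batch, shaped like what color_mode='rgba' would produce.
rgba_batch = np.random.rand(4, 256, 256, 4).astype(np.float32)

# Slice with -1: (not -1) to keep the channel axis: (batch, 256, 256, 1).
alpha = rgba_batch[..., -1:]
print(alpha.shape)
```

Using the slice -1: instead of the index -1 keeps the result four-dimensional, so it matches the single-channel output of the model.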

Sample code that runs without any problems:

import tensorflow as tf
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Conv2DTranspose, Concatenate, BatchNormalization, Input
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def DeConvBlock(input, num_output):
    x = Conv2DTranspose(num_output, kernel_size=3, strides=2, activation='relu', padding='same')(input)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)
    x = Conv2DTranspose(num_output, kernel_size=3, strides=1, activation='relu', padding='same')(x)
    x = BatchNormalization()(x)
    return x

img_input = Input((256, 256, 3))
img_vgg16 = VGG16(include_top=False, input_shape=(256, 256, 3), weights='imagenet')
img_vgg16._name = 'img_vgg16'
img_vgg16.trainable = False

tm_input = Input((256, 256, 3))
tm_vgg16 = VGG16(include_top=False, input_shape=(256, 256, 3), weights='imagenet')
tm_vgg16._name = 'tm_vgg16'
tm_vgg16.trainable = False

img_vgg16 = img_vgg16(img_input)
tm_vgg16 = tm_vgg16(tm_input)
x = Concatenate()([img_vgg16, tm_vgg16])
x = DeConvBlock(x, 512)
x = DeConvBlock(x, 256)
x = DeConvBlock(x, 128)
x = DeConvBlock(x, 64)
x = DeConvBlock(x, 32)
x = Conv2DTranspose(1, kernel_size=3, strides=1, activation='sigmoid', padding='same')(x)

m = Model(inputs=[img_input, tm_input], outputs=x)
m.summary()
m.compile(optimizer='adam', loss='mse')

gen = ImageDataGenerator(width_shift_range=0.1, rotation_range=30, height_shift_range=0.1, horizontal_flip=True, validation_split=0.2, preprocessing_function=preprocess_input)
SEED = 49

def twin_gen(generator, subset):
    gen_img = generator.flow_from_directory('./data', classes=['input_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
    gen_map = generator.flow_from_directory('./data/trimap_training_lowres', classes=['Trimap1'], seed=SEED, shuffle=False, subset=subset, color_mode='rgb')
    gen_truth = generator.flow_from_directory('./data', classes=['gt_training_lowres'], seed=SEED, shuffle=False, subset=subset, color_mode='grayscale')

    while True:
        img = gen_img.__next__()[0]
        tm = gen_map.__next__()[0]
        gt = gen_truth.__next__()[0]
        yield ([img, tm], gt)

train_gen = twin_gen(gen, 'training')

r = m.fit(train_gen, steps_per_epoch=5, epochs=3)

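Well, you are right: adding input_shape makes the model summary look more correct, but unfortunately I still get the same error. Do I need to set input_tensor or something like that?

Then there is also a problem in your data generator; I will update the answer.

@ike I updated the answer; check whether you still have any other problems.

Thank you very much, I believe this solves the problem. I now see a CUDA error, but I think that is a completely separate issue and I will try to solve it myself. Thanks again, accepting your answer now.

The CUDA error turned out to be fixable on my side, so I have verified that your answer here is correct. Thanks again.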