Python Keras autoencoder input image size


Consider this autoencoder:

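    import numpy as np
    from keras.layers import Input, Dense, Conv2D, MaxPooling2D, UpSampling2D, Flatten, Reshape
    from keras.models import Model

    class ConvAutoencoder:
        def __init__(self, image_size, latent_dim):
            inp = Input(shape=(image_size[0], image_size[1], 1))
            x = Conv2D(16, (3, 3), activation='relu', padding='same')(inp)
            x = MaxPooling2D((2, 2), padding='same')(x)
            x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
            x = MaxPooling2D((2, 2), padding='same')(x)
            x = Conv2D(8, (3, 3), activation='relu', padding='same')(x)
            encoded = MaxPooling2D((2, 2), padding='same')(x)
            # at this point the representation is (4, 4, 8) i.e. 128-dimensional

            d = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
            d = UpSampling2D((2, 2))(d)
            d = Conv2D(8, (3, 3), activation='relu', padding='same')(d)
            d = UpSampling2D((2, 2))(d)
            d = Conv2D(16, (3, 3), activation='relu')(d)
            d = UpSampling2D((2, 2))(d)
            decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(d)

            self.model = Model(inp, decoded)
            self.encoder = Model(inp, encoded)
            self.model.compile(loss='mse', optimizer='adam')
            print(self.model.summary())

I instantiate it with

    ConvAutoencoder(image_size=(32, 32), latent_dim=10)

which prints: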

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         (None, 32, 32, 1)         0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 32, 32, 16)        160       
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 16, 16, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 16, 16, 8)         1160      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 8, 8, 8)           0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 8, 8, 8)           584       
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 4, 4, 8)           0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 4, 4, 8)           584       
_________________________________________________________________
up_sampling2d_1 (UpSampling2 (None, 8, 8, 8)           0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 8, 8, 8)           584       
_________________________________________________________________
up_sampling2d_2 (UpSampling2 (None, 16, 16, 8)         0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 14, 14, 16)        1168      
_________________________________________________________________
up_sampling2d_3 (UpSampling2 (None, 28, 28, 16)        0         
_________________________________________________________________
conv2d_7 (Conv2D)            (None, 28, 28, 1)         145       
=================================================================
Total params: 4,385
Trainable params: 4,385
Non-trainable params: 0
_________________________________________________________________
None
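As you can see, the input image size is (32, 32), but the output image size is (28, 28).

Question 1: How can I change the architecture of the autoencoder so that the output image size becomes (32, 32)?

Question 2: As you can see, the class takes a parameter named latent_dim. Currently this parameter is unused. Is there an easy way to reduce the latent dimension of the autoencoder to a given number? For example, by adding a fully connected layer in the middle, or something along those lines?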

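Question 1

Well, you forgot padding='same' on the Conv2D just before the last upsampling. Without it, Keras uses padding='valid', so the 3x3 convolution shrinks the 16x16 map to 14x14, and the final UpSampling2D doubles that to 28x28.

It should look like this: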

        # at this point the representation is (4, 4, 8) i.e. 128-dimensional

        d = Conv2D(8, (3, 3), activation='relu', padding='same')(encoded)
        d = UpSampling2D((2, 2))(d)
        d = Conv2D(8, (3, 3), activation='relu', padding='same')(d)
        d = UpSampling2D((2, 2))(d)
        d = Conv2D(16, (3, 3), activation='relu', padding='same')(d)
        d = UpSampling2D((2, 2))(d)

        decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(d)
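
The size arithmetic behind this fix can be checked without Keras. The following is a minimal sketch that uses the standard output-size rules for 'same' and 'valid' padding; the helper names conv_out and decoder_size are hypothetical and only serve this illustration.

```python
import math


def conv_out(size, kernel=3, padding='same', stride=1):
    """Output length along one spatial axis of a convolution layer."""
    if padding == 'same':
        # 'same' pads so that only the stride reduces the size
        return math.ceil(size / stride)
    # 'valid' applies no padding, so the kernel trims the borders
    return (size - kernel) // stride + 1


def decoder_size(size, third_conv_padding):
    """Trace one side of the feature map through the decoder above."""
    size = conv_out(size, padding='same')              # conv2d_4
    size *= 2                                          # up_sampling2d_1
    size = conv_out(size, padding='same')              # conv2d_5
    size *= 2                                          # up_sampling2d_2
    size = conv_out(size, padding=third_conv_padding)  # conv2d_6
    size *= 2                                          # up_sampling2d_3
    return conv_out(size, padding='same')              # conv2d_7


print(decoder_size(4, 'valid'))  # 28, as in the summary from the question
print(decoder_size(4, 'same'))   # 32, after adding padding='same'
```

Starting from the 4x4 encoded map, the only layer that changes the outcome is the third decoder convolution: with 'valid' padding it turns 16 into 14, and the last upsampling doubles that loss.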
Question 2

Do you mean the number of kernels (filters)? If so, how about this:

        x = Conv2D(latent_dim*4, (3, 3), activation='relu', padding='same')(inp)
        x = MaxPooling2D((2, 2), padding='same')(x)
        x = Conv2D(latent_dim*2, (3, 3), activation='relu', padding='same')(x)
        x = MaxPooling2D((2, 2), padding='same')(x)
        x = Conv2D(latent_dim, (3, 3), activation='relu', padding='same')(x)
        encoded = MaxPooling2D((2, 2), padding='same')(x)
        # at this point the representation is (4, 4, latent_dim) for a 32x32 input

        d = Conv2D(latent_dim, (3, 3), activation='relu', padding='same')(encoded)
        d = UpSampling2D((2, 2))(d)
        d = Conv2D(latent_dim*2, (3, 3), activation='relu', padding='same')(d)
        d = UpSampling2D((2, 2))(d)
        d = Conv2D(latent_dim*4, (3, 3), activation='relu', padding='same')(d)
        d = UpSampling2D((2, 2))(d)
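However, if you want the middle (encoded) layer itself to be a convolution with latent_dim kernels, you can replace the last MaxPooling2D with a strided Conv2D, like this: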

encoded = Conv2D(latent_dim, (3, 3), activation='relu', padding='same', strides=2)(x)
In fact, you can remove all the MaxPooling2D layers and add strides=2 to each Conv2D that preceded one.
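
If the goal is a flat code of exactly latent_dim values, the fully connected layer mentioned in the question is one possible route. The following is a rough sketch under stated assumptions, not part of the original answer: the function name build_autoencoder and the Flatten/Dense/Reshape arrangement are illustrative choices, the encoder uses strides=2 in place of pooling, and both image sides are assumed to be divisible by 8.

```python
from keras.layers import Input, Dense, Conv2D, UpSampling2D, Flatten, Reshape
from keras.models import Model


def build_autoencoder(image_size=(32, 32), latent_dim=10):
    """Hypothetical helper: strided-conv encoder with a Dense bottleneck."""
    inp = Input(shape=(image_size[0], image_size[1], 1))

    # Encoder: strides=2 halves height and width, replacing MaxPooling2D.
    x = Conv2D(16, (3, 3), activation='relu', padding='same', strides=2)(inp)
    x = Conv2D(8, (3, 3), activation='relu', padding='same', strides=2)(x)
    x = Conv2D(8, (3, 3), activation='relu', padding='same', strides=2)(x)

    # Bottleneck: flatten the (H/8, W/8, 8) map and project to latent_dim.
    # Assumes both sides of the image are divisible by 8.
    h, w = image_size[0] // 8, image_size[1] // 8
    encoded = Dense(latent_dim, activation='relu')(Flatten()(x))

    # Decoder: expand back to the feature-map shape, then upsample 3 times.
    d = Dense(h * w * 8, activation='relu')(encoded)
    d = Reshape((h, w, 8))(d)
    d = Conv2D(8, (3, 3), activation='relu', padding='same')(d)
    d = UpSampling2D((2, 2))(d)
    d = Conv2D(8, (3, 3), activation='relu', padding='same')(d)
    d = UpSampling2D((2, 2))(d)
    d = Conv2D(16, (3, 3), activation='relu', padding='same')(d)
    d = UpSampling2D((2, 2))(d)
    decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(d)

    model = Model(inp, decoded)
    encoder = Model(inp, encoded)
    model.compile(loss='mse', optimizer='adam')
    return model, encoder


model, encoder = build_autoencoder((32, 32), latent_dim=10)
print(model.output_shape)    # (None, 32, 32, 1)
print(encoder.output_shape)  # (None, 10)
```

With this layout latent_dim controls the size of the code directly, while the filter counts stay as in the question. Whether a Dense bottleneck reconstructs better than a purely convolutional one depends on the data and would need to be checked experimentally.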