Machine learning: Optimizing CNN filter sizes with Optuna


I created a CNN that classifies three classes from input images of size 39 x 39. I am using Optuna to optimize the network parameters. For Optuna, I define the following parameters to optimize:

num_blocks = trial.suggest_int('num_blocks', 1, 4)
# no list wrapping here; Conv2D expects an int for `filters`
num_filters = trial.suggest_categorical('num_filters', [32, 64, 128, 256])
kernel_size = trial.suggest_int('kernel_size', 2, 7)
num_dense_nodes = trial.suggest_categorical('num_dense_nodes', [64, 128, 256, 512, 1024])
dense_nodes_divisor = trial.suggest_categorical('dense_nodes_divisor', [1, 2, 4, 8])
batch_size = trial.suggest_categorical('batch_size', [16, 32, 64, 128])
drop_out = trial.suggest_float('drop_out', 0.05, 0.5, step=0.05)
lr = trial.suggest_float('lr', 1e-6, 1e-1, log=True)

dict_params = {'num_blocks': num_blocks,
               'num_filters': num_filters,
               'kernel_size': kernel_size,
               'num_dense_nodes': num_dense_nodes,
               'dense_nodes_divisor': dense_nodes_divisor,
               'batch_size': batch_size,
               'drop_out': drop_out,
               'lr': lr}
My network looks like this:

import tensorflow as tf
from tensorflow.keras.layers import (Input, Conv2D, BatchNormalization, Activation,
                                     MaxPooling2D, Dropout, Flatten, Dense)
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam

number_of_classes = 3
input_tensor = Input(shape=(39, 39, 3))

# 1st cnn block
x = Conv2D(filters=dict_params['num_filters'],
           kernel_size=dict_params['kernel_size'],
           strides=1, padding='same')(input_tensor)
x = BatchNormalization()(x)  # Keras sets the training flag in fit()/predict()
x = Activation('relu')(x)
x = MaxPooling2D(padding='same')(x)
x = Dropout(dict_params['drop_out'])(x)

# additional cnn blocks, doubling the filter count in each block
for i in range(1, dict_params['num_blocks']):
    x = Conv2D(filters=dict_params['num_filters'] * (2 ** i),
               kernel_size=dict_params['kernel_size'],
               strides=1, padding='same')(x)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D(padding='same')(x)
    x = Dropout(dict_params['drop_out'])(x)

# mlp
x = Flatten()(x)
x = Dense(dict_params['num_dense_nodes'], activation='relu')(x)
x = Dropout(dict_params['drop_out'])(x)
x = Dense(dict_params['num_dense_nodes'] // dict_params['dense_nodes_divisor'],
          activation='relu')(x)
output_tensor = Dense(number_of_classes, activation='softmax')(x)

# instantiate and compile model
cnn_model = Model(inputs=input_tensor, outputs=output_tensor)
opt = Adam(learning_rate=dict_params['lr'])
loss = 'categorical_crossentropy'
cnn_model.compile(loss=loss, optimizer=opt,
                  metrics=['accuracy', tf.keras.metrics.AUC()])
I am using Optuna to optimize (minimize) the validation loss. The network has at most 4 blocks, and the number of filters doubles in each block: for example, 64 in the first block, 128 in the second, 256 in the third, and so on. There are two problems with this. First, when we start with, say, 256 filters and a total of 4 blocks, the last block ends up with 2048 filters, which is far too many.
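For context, my objective function trains the model and returns the best validation loss, roughly like this (simplified sketch; x_train, y_train, x_val and y_val stand in for my data, and the epoch and trial counts are placeholders):

import optuna

def objective(trial):
    # ... parameter suggestions and model construction as above ...
    history = cnn_model.fit(x_train, y_train,
                            validation_data=(x_val, y_val),
                            batch_size=dict_params['batch_size'],
                            epochs=50, verbose=0)  # epochs: placeholder value
    return min(history.history['val_loss'])

study = optuna.create_study(direction='minimize')
study.optimize(objective, n_trials=100)  # n_trials: placeholder value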

Is it possible to make the num_filters parameter depend on the num_blocks parameter? That is, the more blocks there are, the smaller the starting filter count should be. For example, if num_blocks is chosen as 4, num_filters should only be sampled from 32, 64 and 128. I was thinking of something along the lines of the sketch below.
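Since Optuna builds the search space at runtime (define-by-run), one idea is to sample the starting filter count as a power of two whose upper bound depends on num_blocks (untested sketch; the per-branch parameter name f'filters_exp_{num_blocks}' is just my naming idea, to keep the range of each named distribution fixed across trials):

num_blocks = trial.suggest_int('num_blocks', 1, 4)
# cap the deepest block at 1024 filters: num_filters * 2**(num_blocks - 1) <= 1024
max_exp = min(8, 11 - num_blocks)  # 2**8 = 256 stays the overall maximum
filters_exp = trial.suggest_int(f'filters_exp_{num_blocks}', 5, max_exp)
num_filters = 2 ** filters_exp     # e.g. num_blocks=4 -> 32, 64 or 128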

Second, I think doubling the filter count is common, but there are also networks that keep the filter count constant, or that use two convolutions (with the same number of filters) before each max-pooling layer (similar to VGG), and so on. Is it possible to adapt the Optuna optimization to cover all of these variations? For example, I could imagine sampling the growth pattern itself, as in the sketch below.
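Here the growth pattern is a categorical parameter and the number of convolutions per block is an integer (untested sketch; 'filter_mode' and 'convs_per_block' are names I made up, and x starts from the input tensor as in the model above):

filter_mode = trial.suggest_categorical('filter_mode', ['double', 'constant'])
convs_per_block = trial.suggest_int('convs_per_block', 1, 2)  # 2 ~ VGG-style

x = input_tensor
for i in range(dict_params['num_blocks']):
    if filter_mode == 'double':
        filters = dict_params['num_filters'] * (2 ** i)   # e.g. 64, 128, 256, ...
    else:
        filters = dict_params['num_filters']              # constant filter count
    for _ in range(convs_per_block):
        x = Conv2D(filters=filters, kernel_size=dict_params['kernel_size'],
                   strides=1, padding='same')(x)
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
    x = MaxPooling2D(padding='same')(x)
    x = Dropout(dict_params['drop_out'])(x)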