Using a multi-layer Keras model as a convolutional filter


The last part of my existing model is a two-layer bidirectional recurrent stack that has been trained to extract text from fixed-size images. Ahead of the recurrent stack I have added an extra convolutional layer to the existing architecture so that text detection also works on larger images. The model looks like this:

Layer (type)                 Output Shape              Param #
=================================================================
the_input (InputLayer)       (None, 64, 128, 3)        0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 64, 128, 48)       3648
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 32, 64, 48)        0
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 32, 64, 64)        76864
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 16, 64, 64)        0
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 16, 64, 128)       204928
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 8, 32, 128)        0
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 1, 1, 2048)        67110912
_________________________________________________________________
lambda_1 (Lambda)            (None, 1, 1, 1)           0
=================================================================
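
(For reference, conv2d_4's 67,110,912 parameters are consistent with an 8 x 32 kernel over 128 input channels producing 2048 filters: 8 * 32 * 128 * 2048 + 2048 = 67,110,912.)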
conv2d_4 is the newly added convolutional layer, intended to enable images of sizes other than 128 x 64. lambda_1 is a presence-indicator variable with one value per sliding-window position. The code that builds this model is shown below:

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Lambda
from keras import backend as K

# input_shape and time_dense_size are defined elsewhere; the summary above
# implies input_shape == (64, 128, 3) and time_dense_size == 2048
input_data = Input(name='the_input', shape=input_shape)

def add_conv(prev_component, kernel_dims, num_filters, max_pool_dims=None, padding='same'):
    cnv = Conv2D(num_filters, kernel_dims, activation='relu',
                 padding=padding)(prev_component)
    if max_pool_dims is not None:
        cnv = MaxPooling2D(pool_size=max_pool_dims)(cnv)
    return cnv

prev = input_data
prev = add_conv(prev, (5, 5), 48, max_pool_dims=(2, 2))
prev = add_conv(prev, (5, 5), 64, max_pool_dims=(2, 1))
prev = add_conv(prev, (5, 5), 128, max_pool_dims=(2, 2))
#[kernel_height, kernel_width, prev_filter_count, new_filter_count]
prev = add_conv(prev, (8, 32), time_dense_size, padding='valid')

def is_plate_func(windowed_data):
    # 1 x 1 convolution plus bias over the 2048-channel feature map:
    # one presence score per sliding-window position
    is_plate_w = K.variable(K.truncated_normal(stddev=0.1, shape=(1, 1, 2048, 1)))
    is_plate_b = K.variable(K.constant(.1, shape=[1]))
    is_plate_out = K.bias_add(K.conv2d(windowed_data, is_plate_w), is_plate_b)
    return is_plate_out

is_plate_lambda = Lambda(is_plate_func)(prev)
Model(inputs=input_data, outputs=[is_plate_lambda]).summary()
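
One note on lambda_1: weights created with K.variable inside a Lambda function are not registered as trainable weights of the model, so Keras does not update them during training. If the presence indicator is meant to be trained end-to-end, the same 1 x 1 convolution plus bias can be written as an ordinary Conv2D layer. The following is only a sketch of that alternative, with a made-up layer name, not code from the original model:

from keras.layers import Conv2D
from keras.initializers import TruncatedNormal, Constant

# one presence score per window position, with a trainable kernel and bias
is_plate_conv = Conv2D(1, (1, 1),
                       kernel_initializer=TruncatedNormal(stddev=0.1),
                       bias_initializer=Constant(0.1),
                       name='is_plate')(prev)
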
The recurrent stack that I would like to attach to the model, in the same way as lambda_1, is built as follows:

from keras.models import Sequential
from keras.layers import Reshape, Bidirectional, GRU, Dense

# rnn_size, num_backward_layers and num_output_characters are defined elsewhere
bdrnn_model = Sequential()
bdrnn_model.add(Reshape((2048, 1), input_shape=(1, 1, 2048)))
# sum the two directions in the inner layers, concatenate them in the last layer
for idx in range(num_backward_layers):
    if idx == num_backward_layers - 1:
        merge_mode = 'concat'
    else:
        merge_mode = 'sum'
    bdrnn_model.add(Bidirectional(GRU(rnn_size, return_sequences=True),
                                  merge_mode=merge_mode))
bdrnn_model.add(Dense(num_output_characters + 2, activation='relu'))

Is there a way to get Keras to use bdrnn_model as a convolutional filter?

What do you mean by using a model as a convolutional filter? Do you just want to apply the model in a sliding-window fashion?

@MatiasValdenegro I want to use the trained parameters of the recurrent stack as a filter, applied at every sliding-window position of a 128 x 64 window. The TensorFlow backend's conv2d function allows the trainable filter to be passed in as a tensor, but I am having trouble expressing the stacked RNN as a tensor. Is there a simpler way to do this than going through the backend's conv2d function?

No, you cannot use a model as a convolutional filter; what you need is a sliding window, and TF's conv2d cannot do that. A model cannot be converted into a tensor. You can always slide the window manually.

@MatiasValdenegro Thanks for pointing me in the right direction. I will train the model on images equal to the window size and evaluate it on slices of the larger image.
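
For completeness, a minimal sketch of the manual sliding-window evaluation suggested above. Here window_model stands for a model trained on 64 x 128 crops (for example, the pipeline in the question); the function name, stride, and batching are illustrative assumptions rather than code from the question:

import numpy as np

def sliding_window_predict(window_model, image, win_h=64, win_w=128, stride=32):
    # collect every window of the larger image, then predict on them as one batch
    windows, positions = [], []
    for top in range(0, image.shape[0] - win_h + 1, stride):
        for left in range(0, image.shape[1] - win_w + 1, stride):
            windows.append(image[top:top + win_h, left:left + win_w, :])
            positions.append((top, left))
    preds = window_model.predict(np.stack(windows))
    return positions, preds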