Python: concatenating LSTM outputs
I am trying to build a multi-task image captioning model. It contains two separate encoder-decoder models with LSTMs, each taking its input from a different dataset. The LSTM outputs are combined by concatenation, and the output of the concatenation layer is then passed to Dense layers. Here is the model code:
from tensorflow.keras.layers import (Input, Dropout, Dense, RepeatVector,
                                     Embedding, LSTM, concatenate)
from tensorflow.keras.models import Model

EMBEDDING_DIM = 256  # value implied by the (None, 256) Dense outputs in the summary

def define_model(vocab_size1, max_length1, vocab_size2, max_length2):
    # first branch: image features + caption sequence from dataset 1
    inputs1 = Input(shape=(4096,))
    print(inputs1.shape)
    fe1_1 = Dropout(0.5)(inputs1)
    fe2_1 = Dense(EMBEDDING_DIM, activation='relu')(fe1_1)
    fe3_1 = RepeatVector(max_length1)(fe2_1)
    inputs2 = Input(shape=(max_length1,))
    print(inputs2.shape)
    emb2_1 = Embedding(vocab_size1, EMBEDDING_DIM, mask_zero=True)(inputs2)
    merged1 = concatenate([fe3_1, emb2_1], name='concat1')
    lm2_1 = LSTM(500, return_sequences=False)(merged1)
    # second branch: image features + caption sequence from dataset 2
    inputs3 = Input(shape=(4096,))
    fe1_2 = Dropout(0.5)(inputs3)
    fe2_2 = Dense(EMBEDDING_DIM, activation='relu')(fe1_2)
    fe3_2 = RepeatVector(max_length2)(fe2_2)
    inputs4 = Input(shape=(max_length2,))
    emb2_2 = Embedding(vocab_size2, EMBEDDING_DIM, mask_zero=True)(inputs4)
    merged2 = concatenate([fe3_2, emb2_2], name='concat2')
    lm2_2 = LSTM(500, return_sequences=False)(merged2)
    # merge the two LSTM outputs
    merged3 = concatenate([lm2_1, lm2_2], name='concat3')  # error
    outputs = Dense(vocab_size1, activation='softmax')(merged3)
    outputs1 = Dense(vocab_size2, activation='softmax')(merged3)
    # tie it together [image, seq] [word]
    model = Model(inputs=[inputs1, inputs2, inputs3, inputs4], outputs=[outputs, outputs1])
    model.compile(loss=['categorical_crossentropy', 'categorical_crossentropy'], optimizer='adam', metrics=['accuracy'])
    print(model.summary())
    # plot_model(model, show_shapes=True, to_file='model.png')
    return model
I can initialize it correctly:
model = define_model(fvocab_size, fmax_length, wvocab_size, wmax_length)
Model: "model"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_1 (InputLayer) [(None, 4096)] 0
__________________________________________________________________________________________________
input_3 (InputLayer) [(None, 4096)] 0
__________________________________________________________________________________________________
dropout (Dropout) (None, 4096) 0 input_1[0][0]
__________________________________________________________________________________________________
dropout_1 (Dropout) (None, 4096) 0 input_3[0][0]
__________________________________________________________________________________________________
dense (Dense) (None, 256) 1048832 dropout[0][0]
__________________________________________________________________________________________________
input_2 (InputLayer) [(None, 34)] 0
__________________________________________________________________________________________________
dense_1 (Dense) (None, 256) 1048832 dropout_1[0][0]
__________________________________________________________________________________________________
input_4 (InputLayer) [(None, 21)] 0
__________________________________________________________________________________________________
repeat_vector (RepeatVector) (None, 34, 256) 0 dense[0][0]
__________________________________________________________________________________________________
embedding (Embedding) (None, 34, 256) 1940224 input_2[0][0]
__________________________________________________________________________________________________
repeat_vector_1 (RepeatVector) (None, 21, 256) 0 dense_1[0][0]
__________________________________________________________________________________________________
embedding_1 (Embedding) (None, 21, 256) 1428992 input_4[0][0]
__________________________________________________________________________________________________
concat1 (Concatenate) (None, 34, 512) 0 repeat_vector[0][0]
embedding[0][0]
__________________________________________________________________________________________________
concat2 (Concatenate) (None, 21, 512) 0 repeat_vector_1[0][0]
embedding_1[0][0]
__________________________________________________________________________________________________
lstm (LSTM) (None, 500) 2026000 concat1[0][0]
__________________________________________________________________________________________________
lstm_1 (LSTM) (None, 500) 2026000 concat2[0][0]
__________________________________________________________________________________________________
concat3 (Concatenate) (None, 1000) 0 lstm[0][0]
lstm_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense) (None, 7579) 7586579 concat3[0][0]
__________________________________________________________________________________________________
dense_3 (Dense) (None, 5582) 5587582 concat3[0][0]
==================================================================================================
Total params: 22,693,041
Trainable params: 22,693,041
Non-trainable params: 0
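The inputs to the concatenation have shapes (None, 500) and (None, 500), and the output has shape (None, 1000). However, when I pass actual data through the generator, I get an error: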
`InvalidArgumentError Traceback (most recent call last)
<ipython-input-15-e52b85d1307b> in <module>()
12
13 model.fit(train_generator, epochs=20, verbose=1, steps_per_epoch=steps, validation_steps=val_steps,
---> 14 callbacks=[checkpoint], validation_data=val_generator)
15
16 try:
6 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
1098 _r=1):
1099 callbacks.on_train_batch_begin(step)
-> 1100 tmp_logs = self.train_function(iterator)
1101 if data_handler.should_sync:
1102 context.async_wait()
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
826 tracing_count = self.experimental_get_tracing_count()
827 with trace.Trace(self._name) as tm:
--> 828 result = self._call(*args, **kwds)
829 compiler = "xla" if self._experimental_compile else "nonXla"
830 new_tracing_count = self.experimental_get_tracing_count()
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
886 # Lifting succeeded, so variables are initialized and we can run the
887 # stateless function.
--> 888 return self._stateless_fn(*args, **kwds)
889 else:
890 _, _, _, filtered_flat_args = \
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in __call__(self, *args, **kwargs)
2941 filtered_flat_args) = self._maybe_define_function(args, kwargs)
2942 return graph_function._call_flat(
-> 2943 filtered_flat_args, captured_inputs=graph_function.captured_inputs) # pylint: disable=protected-access
2944
2945 @property
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _call_flat(self, args, captured_inputs, cancellation_manager)
1917 # No tape is watching; skip to running the function.
1918 return self._build_call_outputs(self._inference_function.call(
-> 1919 ctx, args, cancellation_manager=cancellation_manager))
1920 forward_backward = self._select_forward_and_backward_functions(
1921 args,
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in call(self, ctx, args, cancellation_manager)
558 inputs=args,
559 attrs=attrs,
--> 560 ctx=ctx)
561 else:
562 outputs = execute.execute_with_cancellation(
/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py in quick_execute(op_name, num_outputs, inputs, attrs, ctx, name)
58 ctx.ensure_initialized()
59 tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
---> 60 inputs, attrs, num_outputs)
61 except core._NotOkStatusException as e:
62 if name is not None:
InvalidArgumentError: All dimensions except 1 must match. Input 1 has shape [4 500] and doesn't match input 0 with shape [47 500].
[[node gradient_tape/model/concat3/ConcatOffset (defined at <ipython-input-15-e52b85d1307b>:14) ]] [Op:__inference_train_function_14543]
Function call stack:
train_function`
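Everything works when there is only one dataset and no concatenation of LSTMs (plain image captioning).
The shapes reported in the error change each time I call next(generator). As far as I understand, they depend on the description length, even though I use padding.
The Keras tutorial on the functional API contains an example similar to mine, called "Manipulate complex graph topologies", which also concatenates LSTM outputs. I do not understand why my case does not work without some kind of reshaping.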
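I have tried:
- changing concatenate to layers.Concatenate
- changing mask_zero=True to False in the Embedding layers
- creating a common tokenizer for the descriptions in both datasets
- changing the concatenation axis to 0 (which then causes a problem with the logits)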
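The model accepts the inputs at first because the first channel, None, can take any batch size. However, when the tensors coming out of the 2 LSTMs, with shapes (47, 500) and (4, 500), reach the concatenation layer, the layer cannot join them along the last axis because their first axes differ. That is why the error appears during training rather than at compile time.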
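If you are trying to yield one sample (one row of data) at a time from the generator, then your 2D inputs (47, 4096) and (4, 4096) are probably each meant to be a single sample. In that case you should reshape them to (1, 47, 4096) and (1, 4, 4096). This will change your architecture completely, but it would be in line with what I think you are trying to do.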
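A minimal sketch of that reshaping, assuming the arrays are NumPy arrays with the shapes reported in the question. The names ximages1 and ximages2 are placeholders for the generator's image-feature outputs, and zeros stand in for real features:

```python
import numpy as np

# Placeholder image-feature arrays with the shapes from the question.
ximages1 = np.zeros((47, 4096), dtype=np.float32)
ximages2 = np.zeros((4, 4096), dtype=np.float32)

# Add a leading batch axis so that each array becomes one sample.
sample1 = np.expand_dims(ximages1, axis=0)
sample2 = np.expand_dims(ximages2, axis=0)

print(sample1.shape)  # (1, 47, 4096)
print(sample2.shape)  # (1, 4, 4096)
```

Both arrays now share a batch size of 1. The rest of the model would still have to be adapted to the extra axis, which this sketch does not cover.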
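Details: the problem is that you are passing batches of different sizes as inputs to the model. This is possible because the first channel, None, takes the batch size.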
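Let's go step by step through what happens in the model, looking at just 2 of the inputs (Ximages1 and Ximages2).
First pass (for each batch from the generator)
Input layers -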
input_1 (InputLayer) [(None, 4096)] #(47, 4096) Ximages1
input_3 (InputLayer) [(None, 4096)] #(4, 4096) Ximages2
They go through the intermediate layers until they reach their respective LSTMs.
LSTM layers -
lstm (LSTM) (None, 500) concat1[0][0] #(47, 500)
lstm_1 (LSTM) (None, 500) concat2[0][0] #(4, 500)
Now the next layer, the concatenation, tries to merge the two layers into one, like this -
concat3 (Concatenate) (None, 1000) lstm[0][0] #(47, 500)
lstm_1[0][0] #(4, 500)
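From an architectural standpoint, it can concatenate the first (None, 500) with the second (None, 500), where None is the batch-size channel. However, the layer assumes that it receives the same number of samples from both inputs in each batch.
In other words, (47, 500) cannot be concatenated with (4, 500) along the feature axis, because the first axis (47 vs. 4) does not match.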
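The same shape rule can be reproduced outside Keras. The sketch below uses NumPy arrays as stand-ins for the two LSTM outputs, with shapes taken from the error message. try_concat is a hypothetical helper written for this illustration, not part of any library:

```python
import numpy as np

# Stand-ins for the two LSTM outputs in one training step.
lstm_out_1 = np.zeros((47, 500))
lstm_out_2 = np.zeros((4, 500))

def try_concat(a, b, axis):
    """Return the concatenated shape, or the error text if the shapes are incompatible."""
    try:
        return np.concatenate([a, b], axis=axis).shape
    except ValueError as err:
        return f"ValueError: {err}"

# Feature axis: fails, because the batch axes (47 vs. 4) differ.
feature_axis_result = try_concat(lstm_out_1, lstm_out_2, axis=1)
# Batch axis: succeeds, but the row count no longer matches either set of labels.
batch_axis_result = try_concat(lstm_out_1, lstm_out_2, axis=0)
# Feature axis with equal batch sizes: the case the model summary describes.
matching_result = try_concat(lstm_out_1, np.zeros((47, 500)), axis=1)

print(feature_axis_result)
print(batch_axis_result)
print(matching_result)
```

Concatenating on axis 0 runs, but it produces 51 rows for targets of 47 and 4 rows. That is a plausible reason why the axis-0 attempt in the question led to a different error.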
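- You may need to rethink how your generator builds its output batches.
- If (47, 4096) and (4, 4096) are each meant to be a single sample, you may want to output them as 3-dimensional tensors, (1, 47, 4096) and (1, 4, 4096), rather than 2-dimensional ones.
- That way your input layers would work with (None, 47, 4096) and (None, 4, 4096).
- This will change every layer created afterwards accordingly, because they now have to handle the extra channel.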
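One way to read the first point is to make the generator yield the same number of rows for both datasets. The sketch below is an assumption for illustration, not the approach recommended in the answer: aligned_batches and fake_generator are hypothetical helpers, and trimming to the smaller batch discards data from the larger one.

```python
import numpy as np

def aligned_batches(gen1, gen2):
    """Pair batches from two generators and trim each pair to the same row count."""
    for (x_img1, x_seq1, y1), (x_img2, x_seq2, y2) in zip(gen1, gen2):
        n = min(len(x_img1), len(x_img2))  # smaller of the two batch sizes
        yield [x_img1[:n], x_seq1[:n], x_img2[:n], x_seq2[:n]], [y1[:n], y2[:n]]

def fake_generator(batch_sizes, max_length, vocab_size):
    """Hypothetical stand-in for a per-dataset generator with varying batch sizes."""
    for n in batch_sizes:
        yield np.zeros((n, 4096)), np.zeros((n, max_length)), np.zeros((n, vocab_size))

# Shapes follow the ones reported in the comments (34/7579 and 21/5582).
gen1 = fake_generator([47, 30], max_length=34, vocab_size=7579)
gen2 = fake_generator([4, 12], max_length=21, vocab_size=5582)

inputs, targets = next(aligned_batches(gen1, gen2))
batch_sizes = [a.shape[0] for a in inputs + targets]
print(batch_sizes)
```

Every array in the yielded batch now has the same first axis, so the two LSTM outputs would reach the concatenation layer with matching batch sizes. Whether trimming, padding, or resampling is appropriate depends on the data.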
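np.array(XSeq1).shape ... and so on
It is like this: Ximages1: (47, 4096), Xseq1: (47, 34), Ximages2: (4, 4096), Xseq2: (4, 21), y1: (47, 7579), y2: (4, 5582). The second axis of Xseq1 and Xseq2 is the maximum description length in the dataset, the second axis of y1 and y2 is the vocabulary size, and the value of the first axis changes when moving to the next element.
Thanks! Yes, I was being silly. I will try to manage with the new channel.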