Python: How can I train multiple stacked neural networks alternately? (Keras)
Tags: python, tensorflow, keras, deep-learning, neural-network

Suppose I have defined four neural network models, each with its own loss function. The input of each network depends on the output of the previous one:
Model1 -> Model2 -> Model3 -> Model4
For simplicity, let the four networks be as follows:
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
x = np.random.rand(300,10)
y = np.random.rand(300,1)
# Model 1
inputs1 = keras.Input(shape=(10,))  # shape must be a tuple: (10,), not (10)
x1 = layers.Dense(1, activation='relu')(inputs1)
model1 = keras.Model(inputs=inputs1, outputs=x1)
model1.compile(loss="mean_squared_error")
# Model 2
inputs2 = keras.Input(shape=(1,))
x2 = layers.Dense(1, activation='relu')(inputs2)
model2 = keras.Model(inputs=inputs2, outputs=x2)
model2.compile(loss="mean_squared_error")
# Model 3
inputs3 = keras.Input(shape=(1,))
x3 = layers.Dense(1, activation='relu')(inputs3)
model3 = keras.Model(inputs=inputs3, outputs=x3)
model3.compile(loss="mean_squared_error")
# Model 4
inputs4 = keras.Input(shape=(1,))
x4 = layers.Dense(1, activation='relu')(inputs4)
model4 = keras.Model(inputs=inputs4, outputs=x4)
model4.compile(loss="mean_squared_error")
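I want to stack these models and train them one at a time, keeping the weights of the other networks fixed while each one is trained.
So, in one training iteration, Model1 is trained and its output is passed to Model2. Model2 is then trained and its output is passed to Model3, and so on. After that, the next epoch starts.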
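Any ideas?

As I understand your question, you have several models, each with its own loss, metrics, and so on, and each of the 3 models is trainable on its own. In this demonstration we use an image dataset (mnist).

Let's work through it with the MNIST dataset and 3 models: model_1, model_2, and model_3. Each model is trained against the MNIST labels y_train (softmax output), but only model_1 takes the images x_train as input. We take a suitable feature map from model_1 and use it as the input of model_2, then take a suitable feature map from model_2 and pass it on as the input of model_3.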
# model_1, model_2 and model_3 are builder functions, each returning a Keras model.
# model_2 and model_3 take the shape of their input feature map as an argument.
x = model_1()
y = model_2(x.get_layer('conv2_model1').output_shape[1:]) # feat_map: (24, 24, 32)
z = model_3(y.get_layer('conv2_model2').output_shape[1:]) # feat_map: (20, 20, 16)
print(x.get_layer('conv2_model1').output_shape[1:])
print(y.get_layer('conv2_model2').output_shape[1:])
print(x.output_shape, y.output_shape, z.output_shape)
# Output:
# (24, 24, 32)
# (20, 20, 16)
# (None, 10) (None, 10) (None, 10)
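Next, we train the three models (x, y, and z). Before that, let's review what we have done so far. We built model x, whose input is (as mentioned earlier) 28x28x1, and model y, whose input is the output feature map of layer conv2_model1 of model x. In tf.keras, we can obtain the output feature maps of x, specifically those of its layer conv2_model1, which have shape (24, 24, 32). We can now use these feature maps as the input of the next model, model y. Similarly, after training model y, we obtain another feature map for the next model, which in our case is model z.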
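The step that trains model x and produces pred_x is missing from this thread. Below is a minimal sketch of how it could be done, assuming a small stand-in for model x and 10 random samples in place of real MNIST data. The stand-in architecture, the random data, and the names data_x and pred_x_model are assumptions. The names data_y and pred_x match those used in the later training code.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Stand-in data (assumption): 10 random "images" with one-hot labels, not real MNIST.
rng = np.random.default_rng(0)
data_x = rng.random((10, 28, 28, 1)).astype("float32")
data_y = tf.keras.utils.to_categorical(rng.integers(0, 10, size=10), num_classes=10)

# Stand-in for model x (assumption): two 3x3 convolutions and a softmax head.
inputs = tf.keras.Input(shape=(28, 28, 1), name="img1")
h = layers.Conv2D(16, 3, activation="relu", name="conv1_model1")(inputs)
h = layers.Conv2D(32, 3, activation="relu", name="conv2_model1")(h)
h = layers.GlobalAveragePooling2D(name="gap1")(h)
outputs = layers.Dense(10, activation="softmax", name="pred_1")(h)
x = tf.keras.Model(inputs, outputs)

# Train model x on its own loss.
x.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=["accuracy"],
    optimizer=tf.keras.optimizers.Adam())
x.fit(data_x, data_y, epochs=1, verbose=0)

# Sub-model that maps the input of x to the output of its layer conv2_model1.
pred_x_model = tf.keras.Model(x.input, x.get_layer("conv2_model1").output)
pred_x = pred_x_model(data_x)
print(pred_x.shape)  # (10, 24, 24, 32)
```

The sub-model shares its layers with x, so calling it after x.fit uses the trained weights. Its output, pred_x, is what model y is then fitted on.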
y.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
img2 (InputLayer)            [(None, 24, 24, 32)]      0
_________________________________________________________________
conv1_model2 (Conv2D)        (None, 22, 22, 32)        9248
_________________________________________________________________
conv2_model2 (Conv2D)        (None, 20, 20, 16)        4624
_________________________________________________________________
gap2 (GlobalAveragePooling2D (None, 16)                0
_________________________________________________________________
pred_2 (Dense)               (None, 10)                170
=================================================================
import tensorflow as tf

# Compile model `y` and train it on `pred_x`, the feature maps
# extracted from model `x`, which are now the input of `y`.
y.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=['accuracy'],
    optimizer=tf.keras.optimizers.Adam())
y.fit(pred_x, data_y, epochs=1)
# 8ms/step - loss: 2.0186 - accuracy: 0.3000

# Grab the feature maps from layer `conv2_model2` of model `y`;
# they become the input of model `z`.
pred_y_model = tf.keras.Model(y.input, y.get_layer('conv2_model2').output)
pred_y = pred_y_model(pred_x)
print(pred_y.shape)
# (10, 20, 20, 16)
z.summary()
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
img3 (InputLayer)            [(None, 20, 20, 16)]      0
_________________________________________________________________
conv1_model3 (Conv2D)        (None, 18, 18, 32)        4640
_________________________________________________________________
conv2_model3 (Conv2D)        (None, 16, 16, 16)        4624
_________________________________________________________________
gap3 (GlobalAveragePooling2D (None, 16)                0
_________________________________________________________________
pred_3 (Dense)               (None, 10)                170
=================================================================
# Compile model `z` and train it with `pred_y` as its input.
z.compile(
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=['accuracy'],
    optimizer=tf.keras.optimizers.Adam())
z.fit(pred_y, data_y, epochs=1)
# 5ms/step - loss: 1.9422 - accuracy: 0.3000

# With 3 models, the cycle ends here.
# Take the output of model `z`'s final softmax layer.
pred_z = z(pred_y)
print(pred_z.shape)
# (10, 10)
So we have pred_y, a feature map of shape (20, 20, 16) per sample, which is set as the input of model z.
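Applied to the four Dense models from the question, the same idea can be wrapped in an epoch loop. This is only a sketch under stated assumptions: every model is fitted against the same target y (the question does not say what each model's target is), the adam optimizer is used, and make_model, models, n_epochs, and stage_input are hypothetical names introduced for illustration.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

x = np.random.rand(300, 10).astype("float32")
y = np.random.rand(300, 1).astype("float32")


def make_model(input_dim):
    """Build and compile one single-layer stage (hypothetical helper)."""
    inputs = keras.Input(shape=(input_dim,))
    outputs = layers.Dense(1, activation="relu")(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


# Model1 -> Model2 -> Model3 -> Model4
models = [make_model(10), make_model(1), make_model(1), make_model(1)]

n_epochs = 3
for epoch in range(n_epochs):
    stage_input = x
    for i, model in enumerate(models, start=1):
        # Only this model's weights are updated by fit(); the other
        # models are separate objects and are not touched.
        history = model.fit(stage_input, y, epochs=1, batch_size=32, verbose=0)
        print(f"epoch {epoch + 1}, model {i}: loss = {history.history['loss'][0]:.4f}")
        # The output of the freshly trained model feeds the next stage.
        stage_input = model.predict(stage_input, verbose=0)
```

Because each model is compiled and fitted separately, no explicit freezing is needed: a call to fit on one model cannot change the weights of the others.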
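Why not make models 1, 2, and 3 consecutive layers of a single model?
In practice, the models are more complex and serve different tasks. The models described above are kept simple only to make the question clear.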