How to alternately train multiple stacked neural networks in Python? (Keras)


Suppose I have defined four neural network models, each with its own loss function. The input of each network depends on the output of the previous one:

Model1 -> Model2 -> Model3 -> Model4
For simplicity, let the four neural networks be as follows:

from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

x = np.random.rand(300,10)
y = np.random.rand(300,1)

# Model 1
inputs1 = keras.Input(shape=(10,))
x1 = layers.Dense(1, activation='relu')(inputs1)
model1 = keras.Model(inputs=inputs1, outputs=x1)
model1.compile(loss="mean_squared_error")

# Model 2
inputs2 = keras.Input(shape=(1,))
x2 = layers.Dense(1, activation='relu')(inputs2)
model2 = keras.Model(inputs=inputs2, outputs=x2)
model2.compile(loss="mean_squared_error")

# Model 3
inputs3 = keras.Input(shape=(1,))
x3 = layers.Dense(1, activation='relu')(inputs3)
model3 = keras.Model(inputs=inputs3, outputs=x3)
model3.compile(loss="mean_squared_error")

# Model 4
inputs4 = keras.Input(shape=(1,))
x4 = layers.Dense(1, activation='relu')(inputs4)
model4 = keras.Model(inputs=inputs4, outputs=x4)
model4.compile(loss="mean_squared_error")
I want to stack and train these models, training one model at a time while keeping the weights of the remaining networks frozen.

So in one training iteration, Model1 would be trained and its output passed to Model2. Model2 would then be trained, its output passed to Model3, and so on; after that, another epoch would start. A rough sketch of the intended loop is shown below.
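This is a minimal sketch of the scheme, assuming (purely for illustration) that every model is fit against the same targets y; the other networks stay untouched because each model only sees precomputed predictions:

# rough sketch of the desired alternating scheme
for epoch in range(10):
    model1.fit(x, y, epochs=1, verbose=0)
    out1 = model1.predict(x, verbose=0)      # Model1 frozen from here on
    model2.fit(out1, y, epochs=1, verbose=0)
    out2 = model2.predict(out1, verbose=0)   # Model2 frozen from here on
    model3.fit(out2, y, epochs=1, verbose=0)
    out3 = model3.predict(out2, verbose=0)   # Model3 frozen from here on
    model4.fit(out3, y, epochs=1, verbose=0)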


Any ideas?

Based on your query, there are two requirements:

  • The output of one model will be the input of the subsequent model.
  • Each model is trained one at a time, with respect to its own loss and metrics.
  • Below is a schematic of what we need to implement. Let's consider 3 models (x, y, and z), where the output of the first model is the input of the next, and so on. Each of these 3 models is trainable on its own. For the demo we will use an image dataset (mnist). Let's get to it.


  • We will demonstrate one way to achieve this, though there may well be a more convenient approach. We will use the MNIST dataset and 3 models, namely model_1, model_2, and model_3. While each of these models is trained on MNIST's x_train and y_train (softmax), we will grab the appropriate feature maps from model_1 and set them as the input of model_2, and so on.
  • Now we will call model_1, grab an appropriate feature map, and pass it to model_2 as its input, and likewise on into model_3 (the builder functions themselves are reconstructed just after the snippet):

    x = model_1()
    y = model_2(x.get_layer('conv2_model1').output_shape[1:]) # feat_map: (24, 24, 32)
    z = model_3(y.get_layer('conv2_model2').output_shape[1:]) # feat_map: (20, 20, 16)
    
    print(x.get_layer('conv2_model1').output_shape[1:])
    print(y.get_layer('conv2_model2').output_shape[1:])
    print(x.output_shape, y.output_shape, z.output_shape)
    # (24, 24, 32)
    # (20, 20, 16)
    # (None, 10) (None, 10) (None, 10)
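    The snippet above calls builder functions model_1, model_2, and model_3 that are not shown in this excerpt. This is a minimal reconstruction, inferred from the layer names and shapes in the summaries further down, not the answer's original code:

    import tensorflow as tf

    def model_1():
        # raw MNIST images in, softmax out
        inp = tf.keras.Input(shape=(28, 28, 1), name='img1')
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_model1')(inp)  # (26, 26, 32)
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv2_model1')(h)    # (24, 24, 32)
        g = tf.keras.layers.GlobalAveragePooling2D(name='gap1')(h)
        out = tf.keras.layers.Dense(10, activation='softmax', name='pred_1')(g)
        return tf.keras.Model(inp, out)

    def model_2(input_shape):
        # takes model_1's feature-map shape, e.g. (24, 24, 32)
        inp = tf.keras.Input(shape=input_shape, name='img2')
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_model2')(inp)  # (22, 22, 32)
        h = tf.keras.layers.Conv2D(16, 3, activation='relu', name='conv2_model2')(h)    # (20, 20, 16)
        g = tf.keras.layers.GlobalAveragePooling2D(name='gap2')(h)
        out = tf.keras.layers.Dense(10, activation='softmax', name='pred_2')(g)
        return tf.keras.Model(inp, out)

    def model_3(input_shape):
        # takes model_2's feature-map shape, e.g. (20, 20, 16)
        inp = tf.keras.Input(shape=input_shape, name='img3')
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_model3')(inp)  # (18, 18, 32)
        h = tf.keras.layers.Conv2D(16, 3, activation='relu', name='conv2_model3')(h)    # (16, 16, 16)
        g = tf.keras.layers.GlobalAveragePooling2D(name='gap3')(h)
        out = tf.keras.layers.Dense(10, activation='softmax', name='pred_3')(g)
        return tf.keras.Model(inp, out)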
    
  • Now we will train each of the models (x, y, and z) one at a time. But before that, let's recap what we have done so far. We built model x, whose input (as mentioned earlier) is 28x28x1. Model y's input is the output feature map of layer conv2_model1 of model x. In tf.keras we can achieve this as follows:
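    The code block that originally appeared here seems to be missing from this excerpt. This is a sketch of the step just described, assuming data_x and data_y are a small batch of MNIST images and one-hot labels (a batch of 10, judging by the shapes printed below):

    # compile and train model x on the raw images
    x.compile(
        loss = tf.keras.losses.CategoricalCrossentropy(),
        metrics = ['accuracy'],
        optimizer = tf.keras.optimizers.Adam())
    x.fit(data_x, data_y, epochs=1)

    # grab the feature maps from model `x`'s layer (`conv2_model1`) which
    # would be the input for model `y`
    pred_x_model = tf.keras.Model(x.input, x.get_layer('conv2_model1').output)

    pred_x = pred_x_model(data_x)
    print(pred_x.shape)
    # (10, 24, 24, 32)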
  • As we can see, we obtained the output feature map of model x, specifically of its layer conv2_model1, with shape (24, 24, 32). Now we can use those feature maps as the input for the next model, model y. Likewise, after training model y, we can grab another feature map for the next model, in our case model z.

    y.summary()
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    img2 (InputLayer)            [(None, 24, 24, 32)]      0         
    _________________________________________________________________
    conv1_model2 (Conv2D)        (None, 22, 22, 32)        9248      
    _________________________________________________________________
    conv2_model2 (Conv2D)        (None, 20, 20, 16)        4624      
    _________________________________________________________________
    gap2 (GlobalAveragePooling2D (None, 16)                0         
    _________________________________________________________________
    pred_2 (Dense)               (None, 10)                170       
    =================================================================
    
    # compile and train model y,
    # using `pred_x` (the feature maps from model x) as its input
    y.compile(
        loss = tf.keras.losses.CategoricalCrossentropy(),
        metrics = ['accuracy'],
        optimizer = tf.keras.optimizers.Adam())
    y.fit(pred_x, data_y, epochs=1)
    # 8ms/step - loss: 2.0186 - accuracy: 0.3000
    
    # grab the proper feature maps from model `y`'s layer (`conv2_model2`) which 
    # would be the input for model `z`.
    pred_y_model = tf.keras.Model(y.input, y.get_layer('conv2_model2').output)
    
    pred_y = pred_y_model(pred_x)
    print(pred_y.shape)
    # (10, 20, 20, 16)
    
    So we now have pred_y, a feature map of size (20, 20, 16), which can be set as the input for model z.

    z.summary()
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    img3 (InputLayer)            [(None, 20, 20, 16)]      0         
    _________________________________________________________________
    conv1_model3 (Conv2D)        (None, 18, 18, 32)        4640      
    _________________________________________________________________
    conv2_model3 (Conv2D)        (None, 16, 16, 16)        4624      
    _________________________________________________________________
    gap3 (GlobalAveragePooling2D (None, 16)                0         
    _________________________________________________________________
    pred_3 (Dense)               (None, 10)                170       
    =================================================================
    
    # compile and train model z,
    # using `pred_y` as its input
    z.compile(
        loss = tf.keras.losses.CategoricalCrossentropy(),
        metrics = ['accuracy'],
        optimizer = tf.keras.optimizers.Adam())
    z.fit(pred_y, data_y, epochs=1)
    # 5ms/step - loss: 1.9422 - accuracy: 0.3000
    
    # as we have 3 models, our model cycling ends here.
    # we take the output from model z's last softmax layer
    pred_z = z(pred_y)
    print(pred_z.shape)
    # (10, 10)
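    To repeat the cycle for multiple passes, the steps above can be wrapped in a loop. A sketch, reusing pred_x_model from the earlier sketch and pred_y_model from above (num_epochs is illustrative):

    # repeat the x -> y -> z cycle once per epoch
    num_epochs = 5
    for epoch in range(num_epochs):
        x.fit(data_x, data_y, epochs=1)
        pred_x = pred_x_model(data_x)   # frozen features from x
        y.fit(pred_x, data_y, epochs=1)
        pred_y = pred_y_model(pred_x)   # frozen features from y
        z.fit(pred_y, data_y, epochs=1)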

    Comment: Why not build a single model with models 1, 2, and 3 as consecutive layers?
    Reply (asker): In practice the models are more complex and serve different tasks; the models described above were kept simple so that the question is clear.