How to alternately train multiple stacked neural networks in Python? (Keras)


Suppose I have defined four neural network models, each with its own loss function. The input of each network depends on the output of the previous one:

Model1 -> Model2 -> Model3 -> Model4
For simplicity, let the four neural networks be as follows:

from tensorflow import keras
from tensorflow.keras import layers
import numpy as np

x = np.random.rand(300,10)
y = np.random.rand(300,1)

# Model 1
inputs1 = keras.Input(shape=(10,))
x1 = layers.Dense(1, activation='relu')(inputs1)
model1 = keras.Model(inputs=inputs1, outputs=x1)
model1.compile(loss="mean_squared_error")

# Model 2
inputs2 = keras.Input(shape=(1,))
x2 = layers.Dense(1, activation='relu')(inputs2)
model2 = keras.Model(inputs=inputs2, outputs=x2)
model2.compile(loss="mean_squared_error")

# Model 3
inputs3 = keras.Input(shape=(1,))
x3 = layers.Dense(1, activation='relu')(inputs3)
model3 = keras.Model(inputs=inputs3, outputs=x3)
model3.compile(loss="mean_squared_error")

# Model 4
inputs4 = keras.Input(shape=(1,))
x4 = layers.Dense(1, activation='relu')(inputs4)
model4 = keras.Model(inputs=inputs4, outputs=x4)
model4.compile(loss="mean_squared_error")
I want to stack and train these models, training one model at a time while keeping the weights of the remaining networks frozen.

So in one training iteration, Model1 would be trained and its output passed to Model2. Model2 would then be trained, its output passed to Model3, and so on; after that, another epoch would start. A rough sketch of the intended loop is shown below.
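This is a minimal sketch of the scheme, assuming (purely for illustration) that every model is fit against the same targets y; the other networks stay untouched because each model only sees precomputed predictions:

# rough sketch of the desired alternating scheme
for epoch in range(10):
    model1.fit(x, y, epochs=1, verbose=0)
    out1 = model1.predict(x, verbose=0)      # Model1 frozen from here on
    model2.fit(out1, y, epochs=1, verbose=0)
    out2 = model2.predict(out1, verbose=0)   # Model2 frozen from here on
    model3.fit(out2, y, epochs=1, verbose=0)
    out3 = model3.predict(out2, verbose=0)   # Model3 frozen from here on
    model4.fit(out3, y, epochs=1, verbose=0)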


Any ideas?

Based on your query, there are two requirements:

  • The output of one model will be the input of the subsequent model.
  • Each model is trained one at a time, with respect to its own loss and metrics.
  • Below is a schematic of what we need to implement. Let's consider 3 models (x, y, and z), where the output of the first model is the input of the next, and so on. Each of these 3 models is trainable on its own. For the demo we will use an image dataset (mnist). Let's get to it.


  • We will demonstrate one way to achieve this, though there may well be a more convenient approach. We will use the MNIST dataset and 3 models, namely model_1, model_2, and model_3. While each of these models is trained on MNIST's x_train and y_train (softmax), we will grab the appropriate feature maps from model_1 and set them as the input of model_2, and so on.
  • Now we will call model_1, grab an appropriate feature map, and pass it to model_2 as its input, and likewise on into model_3 (the builder functions themselves are reconstructed just after the snippet):

    x = model_1()
    y = model_2(x.get_layer('conv2_model1').output_shape[1:]) # feat_map: (24, 24, 32)
    z = model_3(y.get_layer('conv2_model2').output_shape[1:]) # feat_map: (20, 20, 16)
    
    print(x.get_layer('conv2_model1').output_shape[1:])
    print(y.get_layer('conv2_model2').output_shape[1:])
    print(x.output_shape, y.output_shape, z.output_shape)
    # (24, 24, 32)
    # (20, 20, 16)
    # (None, 10) (None, 10) (None, 10)
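    The snippet above calls builder functions model_1, model_2, and model_3 that are not shown in this excerpt. This is a minimal reconstruction, inferred from the layer names and shapes in the summaries further down, not the answer's original code:

    import tensorflow as tf

    def model_1():
        # raw MNIST images in, softmax out
        inp = tf.keras.Input(shape=(28, 28, 1), name='img1')
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_model1')(inp)  # (26, 26, 32)
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv2_model1')(h)    # (24, 24, 32)
        g = tf.keras.layers.GlobalAveragePooling2D(name='gap1')(h)
        out = tf.keras.layers.Dense(10, activation='softmax', name='pred_1')(g)
        return tf.keras.Model(inp, out)

    def model_2(input_shape):
        # takes model_1's feature-map shape, e.g. (24, 24, 32)
        inp = tf.keras.Input(shape=input_shape, name='img2')
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_model2')(inp)  # (22, 22, 32)
        h = tf.keras.layers.Conv2D(16, 3, activation='relu', name='conv2_model2')(h)    # (20, 20, 16)
        g = tf.keras.layers.GlobalAveragePooling2D(name='gap2')(h)
        out = tf.keras.layers.Dense(10, activation='softmax', name='pred_2')(g)
        return tf.keras.Model(inp, out)

    def model_3(input_shape):
        # takes model_2's feature-map shape, e.g. (20, 20, 16)
        inp = tf.keras.Input(shape=input_shape, name='img3')
        h = tf.keras.layers.Conv2D(32, 3, activation='relu', name='conv1_model3')(inp)  # (18, 18, 32)
        h = tf.keras.layers.Conv2D(16, 3, activation='relu', name='conv2_model3')(h)    # (16, 16, 16)
        g = tf.keras.layers.GlobalAveragePooling2D(name='gap3')(h)
        out = tf.keras.layers.Dense(10, activation='softmax', name='pred_3')(g)
        return tf.keras.Model(inp, out)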
    
  • Now we will train each of the models (x, y, and z) one at a time. But before that, let's recap what we have done so far. We built model x, whose input (as mentioned earlier) is 28x28x1. Model y's input is the output feature map of layer conv2_model1 of model x. In tf.keras we can achieve this as follows:
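    The code block that originally appeared here seems to be missing from this excerpt. This is a sketch of the step just described, assuming data_x and data_y are a small batch of MNIST images and one-hot labels (a batch of 10, judging by the shapes printed below):

    # compile and train model x on the raw images
    x.compile(
        loss = tf.keras.losses.CategoricalCrossentropy(),
        metrics = ['accuracy'],
        optimizer = tf.keras.optimizers.Adam())
    x.fit(data_x, data_y, epochs=1)

    # grab the feature maps from model `x`'s layer (`conv2_model1`) which
    # would be the input for model `y`
    pred_x_model = tf.keras.Model(x.input, x.get_layer('conv2_model1').output)

    pred_x = pred_x_model(data_x)
    print(pred_x.shape)
    # (10, 24, 24, 32)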
  • As we can see, we obtained the output feature map of model x, specifically of its layer conv2_model1, with shape (24, 24, 32). Now we can use those feature maps as the input for the next model, model y. Likewise, after training model y, we can grab another feature map for the next model, in our case model z.

    y.summary()
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    img2 (InputLayer)            [(None, 24, 24, 32)]      0         
    _________________________________________________________________
    conv1_model2 (Conv2D)        (None, 22, 22, 32)        9248      
    _________________________________________________________________
    conv2_model2 (Conv2D)        (None, 20, 20, 16)        4624      
    _________________________________________________________________
    gap2 (GlobalAveragePooling2D (None, 16)                0         
    _________________________________________________________________
    pred_2 (Dense)               (None, 10)                170       
    =================================================================
    
    # compile and train model y,
    # using `pred_x` (the feature maps from model x) as its input
    y.compile(
        loss = tf.keras.losses.CategoricalCrossentropy(),
        metrics = ['accuracy'],
        optimizer = tf.keras.optimizers.Adam())
    y.fit(pred_x, data_y, epochs=1)
    # 8ms/step - loss: 2.0186 - accuracy: 0.3000
    
    # grab the proper feature maps from model `y`'s layer (`conv2_model2`) which 
    # would be the input for model `z`.
    pred_y_model = tf.keras.Model(y.input, y.get_layer('conv2_model2').output)
    
    pred_y = pred_y_model(pred_x)
    print(pred_y.shape)
    # (10, 20, 20, 16)
    
    So we now have pred_y, a feature map of size (20, 20, 16), which can be set as the input for model z.

    z.summary()
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    img3 (InputLayer)            [(None, 20, 20, 16)]      0         
    _________________________________________________________________
    conv1_model3 (Conv2D)        (None, 18, 18, 32)        4640      
    _________________________________________________________________
    conv2_model3 (Conv2D)        (None, 16, 16, 16)        4624      
    _________________________________________________________________
    gap3 (GlobalAveragePooling2D (None, 16)                0         
    _________________________________________________________________
    pred_3 (Dense)               (None, 10)                170       
    =================================================================
    
    # compile and train model z,
    # using `pred_y` as its input
    z.compile(
        loss = tf.keras.losses.CategoricalCrossentropy(),
        metrics = ['accuracy'],
        optimizer = tf.keras.optimizers.Adam())
    z.fit(pred_y, data_y, epochs=1)
    # 5ms/step - loss: 1.9422 - accuracy: 0.3000
    
    # as we have 3 models, our model cycling ends here.
    # we take the output from model z's last softmax layer
    pred_z = z(pred_y)
    print(pred_z.shape)
    # (10, 10)
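    To repeat the cycle for multiple passes, the steps above can be wrapped in a loop. A sketch, reusing pred_x_model from the earlier sketch and pred_y_model from above (num_epochs is illustrative):

    # repeat the x -> y -> z cycle once per epoch
    num_epochs = 5
    for epoch in range(num_epochs):
        x.fit(data_x, data_y, epochs=1)
        pred_x = pred_x_model(data_x)   # frozen features from x
        y.fit(pred_x, data_y, epochs=1)
        pred_y = pred_y_model(pred_x)   # frozen features from y
        z.fit(pred_y, data_y, epochs=1)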

    Comment: Why not build a single model with models 1, 2, and 3 as consecutive layers?
    Reply (asker): In practice the models are more complex and serve different tasks; the models described above were kept simple so that the question is clear.