Neural network: freezing layers in TensorFlow 2

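I have a LeNet-300-100 dense neural network for the MNIST dataset, and I want to freeze the first two layers, i.e. the two hidden layers with 300 and 100 neurons. I only want to train the output layer. The code I have for this is as follows: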

import tensorflow as tf
from tensorflow import keras

# Inner model: the two hidden layers (300 and 100 units) that should be frozen
inner_model = keras.Sequential(
    [
        keras.Input(shape=(1024,)),
        keras.layers.Dense(300, activation="relu", kernel_initializer=tf.initializers.GlorotNormal()),
        keras.layers.Dense(100, activation="relu", kernel_initializer=tf.initializers.GlorotNormal()),
    ]
)

# Outer model: the inner model followed by the trainable output layer
model_mnist = keras.Sequential(
    [keras.Input(shape=(1024,)), inner_model, keras.layers.Dense(10, activation="softmax")]
)

# model_mnist.trainable = True  # Keep the outer model trainable
# Freeze the inner model
inner_model.trainable = False

# Sanity check
print(inner_model.trainable, model_mnist.trainable)
# False True

# Compile the network
model_mnist.compile(
    loss=tf.keras.losses.categorical_crossentropy,
    # optimizer='adam',
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0012),
    metrics=['accuracy'])
    
However, this code does not seem to freeze the first two hidden layers; they are still learning. What am I doing wrong?

Thanks.

Solution: when defining the neural network model, use the `trainable` argument to freeze the desired layers, as shown below:

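import tensorflow as tf
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()

# Frozen hidden layers: trainable=False keeps their weights out of training
model.add(Dense(units=300, activation="relu", kernel_initializer=tf.initializers.GlorotNormal(), trainable=False))

model.add(Dense(units=100, activation="relu", kernel_initializer=tf.initializers.GlorotNormal(), trainable=False))

# Output layer, left trainable
model.add(Dense(units=10, activation="softmax"))

# Compile model as usual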
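
One way to check whether the freeze actually took effect is to snapshot each layer's weights, train for a short while, and see which layers changed. The following is only a minimal sketch of that check, not part of the original answer: the random input data is a hypothetical stand-in for the real MNIST arrays, the initializers are left at their defaults for brevity, and the variable names (`before`, `changed`, `n_trainable`) are made up for illustration.

```python
import numpy as np
from tensorflow import keras

# Same layer sizes as the answer, with the two hidden layers frozen.
model = keras.Sequential([
    keras.Input(shape=(1024,)),
    keras.layers.Dense(300, activation="relu", trainable=False),
    keras.layers.Dense(100, activation="relu", trainable=False),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(
    loss="categorical_crossentropy",
    optimizer=keras.optimizers.Adam(learning_rate=0.0012),
)

# Random stand-in data (assumption: replace with the real dataset).
rng = np.random.default_rng(0)
x = rng.random((64, 1024)).astype("float32")
labels = rng.integers(0, 10, size=64)
y = keras.utils.to_categorical(labels, num_classes=10)

# Snapshot every layer's weights before training.
before = [[w.copy() for w in layer.get_weights()] for layer in model.layers]

model.fit(x, y, epochs=1, batch_size=32, verbose=0)

# For each layer, record whether any of its weight arrays changed.
changed = []
for layer, old in zip(model.layers, before):
    new = layer.get_weights()
    changed.append(any(not np.array_equal(o, n) for o, n in zip(old, new)))

# Number of weight tensors the optimizer is allowed to update.
n_trainable = len(model.trainable_weights)

print("changed per layer:", changed)
print("trainable weight tensors:", n_trainable)
```

If the two hidden layers are frozen, only the last entry of `changed` should be true, and `n_trainable` should count just the kernel and bias of the output layer. The same check can be run against the nested `inner_model` setup from the question to see whether its hidden layers really are being updated.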
