Python Keras: Frozen layers do not produce consistent output during training

python keras

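I am trying to fine-tune a model with Keras, following this description:
However, during training I found that, for the same input, the network's output does not stay the same after training, even when all relevant layers are frozen. That is not what I want.

I built the following toy example to investigate this: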

import keras.applications.resnet50 as resnet50
from keras.layers import Dense, Flatten, Input
from keras.models import Model
from keras.utils import to_categorical
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
import numpy as np

# data  
i = np.random.rand(1,224,224,3)
X = np.random.rand(32,224,224,3)
y = to_categorical(np.random.randint(751, size=32), num_classes=751)

# model
base_model = resnet50.ResNet50(weights='imagenet', include_top=False, input_tensor=Input(shape=(224,224,3)))
layer = base_model.output
layer = Flatten(name='myflatten')(layer)
layer = Dense(751, activation='softmax', name='fc751')(layer)
model = Model(inputs=base_model.input, outputs=layer)

# freeze all layers
for layer in model.layers:
    layer.trainable = False
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# features and predictions before training
feat0 = base_model.predict(i)
pred0 = model.predict(i)
weights0 = model.layers[-1].get_weights()

# before training output is consistent
feat00 = base_model.predict(i)
pred00 = model.predict(i)
print(np.allclose(feat0, feat00)) # True
print(np.allclose(pred0, pred00)) # True

# train
model.fit(X, y, batch_size=2, epochs=3, shuffle=False)

# features and predictions after training
feat1 = base_model.predict(i)
pred1 = model.predict(i)
weights1 = model.layers[-1].get_weights()

# these are not the same
print(np.allclose(feat0, feat1)) # False
# Optionally: printing shows they are in fact very different
# print(feat0)
# print(feat1)

# these are not the same
print(np.allclose(pred0, pred1)) # False
# Optionally: printing shows they are in fact very different
# print(pred0)
# print(pred1)

# these are the same and loss does not change during training
# so layers were actually frozen
print(np.allclose(weights0[0], weights1[0])) # True

# Check again if all layers were in fact untrainable
for layer in model.layers:
    assert not layer.trainable # All succeed
# Being overly cautious also checking base_model
for layer in base_model.layers:
    assert not layer.trainable # All succeed

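Since I froze all layers, I fully expected both the predictions and the features to be equal, but surprisingly they are not.

So I am probably making some mistake, but I cannot figure out what it is. Any suggestions would be greatly appreciated.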

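The problem seems to be that the model uses batch normalization layers, which update their internal state (i.e. their weights) during training based on the data they see. This happens even when their trainable flag is set to False. As these weights are updated, the output changes as well. You can check this by taking the code from the question and changing the following lines: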

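Change

weights0 = model.layers[-1].get_weights()
to
weights0 = model.layers[2].get_weights()
and
weights1 = model.layers[-1].get_weights()
to
weights1 = model.layers[2].get_weights()
or use the index of any other batch normalization layer.
The following check will then no longer hold:

print(np.allclose(weights0, weights1))  # Now this is False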
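
Rather than testing one layer index at a time, the same check could be run over every batch normalization layer. The following is a minimal sketch under some assumptions: `snapshot_batchnorm` and `changed_layers` are hypothetical helper names, layers are matched by the class name `BatchNormalization` so the sketch needs no Keras import, and the comparison is exact equality instead of `np.allclose`.

```python
def snapshot_batchnorm(model):
    """Return {layer name: weights as nested lists} for every BatchNormalization layer."""
    snapshot = {}
    for layer in model.layers:
        # Assumption: match on the class name so no Keras import is needed here.
        if type(layer).__name__ == "BatchNormalization":
            # get_weights() returns NumPy arrays; tolist() makes plain-Python copies.
            snapshot[layer.name] = [w.tolist() for w in layer.get_weights()]
    return snapshot


def changed_layers(before, after):
    """Names of layers whose stored weights differ between two snapshots."""
    return [name for name in before if before[name] != after[name]]


# Usage with the model from the question (assumes Keras is installed):
# before = snapshot_batchnorm(model)
# model.fit(X, y, batch_size=2, epochs=3, shuffle=False)
# after = snapshot_batchnorm(model)
# print(changed_layers(before, after))
```

An empty list would mean that no batch normalization layer changed; any names printed point at layers whose stored values moved during `fit`.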

As far as I know, there is currently no solution for this.
See also my issue on the Keras GitHub page.
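
The mechanism described in this answer can be illustrated without Keras. The following is a minimal sketch, assuming a simplified one-feature batch normalization with an exponential moving average; `ToyBatchNorm` and its fields are hypothetical names for illustration and not the Keras implementation. Gamma and beta stay fixed, yet the inference output for the same input changes after a few training-mode passes, because the moving mean and variance have been updated.

```python
import random
import statistics


class ToyBatchNorm:
    """Toy 1-D batch normalization with fixed gamma/beta (illustration only)."""

    def __init__(self, momentum=0.99, eps=1e-3):
        self.gamma = 1.0        # "frozen" scale, never modified below
        self.beta = 0.0         # "frozen" shift, never modified below
        self.moving_mean = 0.0  # running statistics, updated in training mode
        self.moving_var = 1.0
        self.momentum = momentum
        self.eps = eps

    def __call__(self, batch, training):
        if training:
            # Training mode: normalize with batch statistics and update the running ones.
            mean = statistics.fmean(batch)
            var = statistics.pvariance(batch, mu=mean)
            self.moving_mean = self.momentum * self.moving_mean + (1 - self.momentum) * mean
            self.moving_var = self.momentum * self.moving_var + (1 - self.momentum) * var
        else:
            # Inference mode: normalize with the running statistics.
            mean, var = self.moving_mean, self.moving_var
        scale = self.gamma / (var + self.eps) ** 0.5
        return [scale * (x - mean) + self.beta for x in batch]


random.seed(0)
bn = ToyBatchNorm()
probe = [0.5]

out_before = bn(probe, training=False)
for _ in range(50):
    # Tiny batches of size 2, as in the question's fit() call.
    bn([random.uniform(0.0, 1.0) for _ in range(2)], training=True)
out_after = bn(probe, training=False)

params_unchanged = (bn.gamma, bn.beta) == (1.0, 0.0)
outputs_equal = abs(out_before[0] - out_after[0]) < 1e-8
print(params_unchanged, outputs_equal)  # True False
```

Whether a real `BatchNormalization` layer with `trainable = False` behaves like this depends on the installed Keras version, since later releases changed that behaviour.
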
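Another reason for the unstable training may be that you are using a very small batch size, i.e. batch_size=2. That value is too small for batch normalization to reliably estimate the statistics (mean and variance) of the training distribution. Use at least batch_size=32. These mean and variance values are used first to normalize the distribution and then to learn the beta and gamma parameters (which describe the actual distribution).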
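
To give a rough feel for why the batch size matters here, the following sketch compares how much the per-batch mean fluctuates for batches of 2 versus 32 samples. It is an illustration under stated assumptions, not a measurement on the model from the question: the data is drawn from a standard normal distribution, and `batch_mean_spread` is a hypothetical helper name.

```python
import random
import statistics


def batch_mean_spread(batch_size, n_batches=2000, seed=0):
    """Standard deviation of per-batch means for samples drawn from N(0, 1)."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(batch_size)])
        for _ in range(n_batches)
    ]
    return statistics.pstdev(means)


spread_2 = batch_mean_spread(2)
spread_32 = batch_mean_spread(32)
print(spread_2 > spread_32)  # True: batches of 2 give much noisier mean estimates
```

The spread of the batch mean shrinks roughly with the square root of the batch size, so the estimate from 2 samples is about four times noisier than the one from 32 samples.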
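For more details, see the following links:
In the introduction and related work, the authors criticize BatchNorm; also take a look at Figure 1:

A good article on "the curse of batch norm":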

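Can you check whether X is being changed (perhaps shuffled) by the fit generator? There is a good chance of that, because the generator creates/loads data instead of just using the random X. You can test whether the same thing happens with model.fit(X, y, batch_size=2, epochs=3).
That sounds plausible, I will try it as soon as I get home tonight!
I did as you suggested, but the result did not change. I have edited the question with more tests. Testing with a different input, I show that the output is consistent before training but changes after training.
Did you find an answer or workaround? I am running into the same problem.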