TensorFlow: meaning of the "trainable" and "training" flags in tf.layers.batch_normalization
What is the significance of the "trainable" and "training" flags in tf.layers.batch_normalization? How do these two differ between training and prediction?
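`training` controls whether to use training-mode batch norm (which uses statistics from the current mini-batch) or inference-mode batch norm (which uses statistics averaged over the training data). `trainable` controls whether the variables created inside the batch norm layer are themselves trainable.

Batch norm has two phases: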
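1. Training:
   - Normalize layer activations using the mean and variance of the current mini-batch, then scale and shift with `gamma` and `beta`.
     (`training` should be `True`.)
   - Update the `moving_avg` and `moving_var` statistics through the ops in `tf.GraphKeys.UPDATE_OPS`.
   - Update `beta` and `gamma` through the optimizer.
     (`trainable` should be `True`.)
2. Inference:
   - Normalize layer activations using `moving_avg`, `moving_var`, `beta` and `gamma`.
     (`training` should be `False`.)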
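The two phases above can be sketched in plain NumPy. This is only an illustration of the arithmetic, not the TensorFlow implementation: the function names `batchnorm_train` and `batchnorm_infer` are made up for this sketch, and `EPSILON` and `MOMENTUM` are assumed values chosen to resemble the layer's usual defaults.

```python
import numpy as np

EPSILON = 1e-3   # assumed small constant for numerical stability
MOMENTUM = 0.99  # assumed decay rate of the moving statistics


def batchnorm_train(x, gamma, beta, moving_mean, moving_var):
    """Training mode: normalize with the statistics of this mini-batch."""
    axes = tuple(range(x.ndim - 1))  # reduce over every axis except channels
    batch_mean = x.mean(axis=axes)
    batch_var = x.var(axis=axes)
    out = gamma * (x - batch_mean) / np.sqrt(batch_var + EPSILON) + beta
    # Exponential moving averages, stored for use at inference time.
    new_mean = MOMENTUM * moving_mean + (1.0 - MOMENTUM) * batch_mean
    new_var = MOMENTUM * moving_var + (1.0 - MOMENTUM) * batch_var
    return out, new_mean, new_var


def batchnorm_infer(x, gamma, beta, moving_mean, moving_var):
    """Inference mode: normalize with the stored moving statistics."""
    return gamma * (x - moving_mean) / np.sqrt(moving_var + EPSILON) + beta


rng = np.random.default_rng(0)
x = rng.integers(0, 10, size=(1, 2, 2, 4)).astype(np.float32)
gamma = np.full(4, 2.0, dtype=np.float32)    # all twos
beta = np.ones(4, dtype=np.float32)          # all ones
moving_mean = np.zeros(4, dtype=np.float32)  # all zeros
moving_var = np.ones(4, dtype=np.float32)    # all ones

train_out, new_mean, new_var = batchnorm_train(x, gamma, beta, moving_mean, moving_var)
infer_out = batchnorm_infer(x, gamma, beta, moving_mean, moving_var)
print("training-mode output:\n", np.round(train_out, 2))
print("inference-mode output:\n", np.round(infer_out, 2))
print("updated moving mean:", new_mean)
```

In training mode each channel of the output is centered on `beta`, whatever the input looks like. In inference mode the output depends only on the stored `moving_mean` and `moving_var`, so with the initial values used here it is just a rescaled copy of the input.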
    import numpy as np
    import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.placeholder, tf.Session)

    # Random image
    img = np.random.randint(0, 10, (2, 2, 4)).astype(np.float32)

    # Batch norm parameters, initialized to known values
    beta = np.ones(4, dtype=np.float32) * 1       # all ones
    gamma = np.ones(4, dtype=np.float32) * 2      # all twos
    moving_mean = np.zeros(4, dtype=np.float32)   # all zeros
    moving_var = np.ones(4, dtype=np.float32)     # all ones

    # Placeholder for the input image
    _input = tf.placeholder(tf.float32, shape=(1, 2, 2, 4), name='input')

    # Batch norm; change `training` and `trainable` here to try the other cases
    out = tf.layers.batch_normalization(
        _input,
        beta_initializer=tf.constant_initializer(beta),
        gamma_initializer=tf.constant_initializer(gamma),
        moving_mean_initializer=tf.constant_initializer(moving_mean),
        moving_variance_initializer=tf.constant_initializer(moving_var),
        training=False, trainable=False)

    # Ops that update the moving mean and moving variance
    update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
    init_op = tf.global_variables_initializer()

    # Run the graph in a session
    with tf.Session() as sess:
        # Initialize the variables
        sess.run(init_op)
        for i in range(2):
            ops, o = sess.run([update_ops, out], feed_dict={_input: np.expand_dims(img, 0)})
            print('beta', sess.run('batch_normalization/beta:0'))
            print('gamma', sess.run('batch_normalization/gamma:0'))
            print('moving_avg', sess.run('batch_normalization/moving_mean:0'))
            print('moving_variance', sess.run('batch_normalization/moving_variance:0'))
            print('out', np.round(o))
            print('')
The TensorFlow example above illustrates a few cases:
- when `training=False` and `trainable=False`
- when `training=True` and `trainable=False`
- when `training=True` and `trainable=True`
This is quite complicated, and in TF 2.0 the behavior changed. See the following note about setting `layer.trainable = False` on a `BatchNormalization` layer:
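The meaning of setting `layer.trainable = False` is to freeze the layer, i.e. its internal state will not change during training: its trainable weights will not be updated during `fit()` or `train_on_batch()`, and its state updates will not be run.

Usually, this does not necessarily mean that the layer is run in inference mode (which is normally controlled by the `training` argument that can be passed when calling a layer). "Frozen state" and "inference mode" are two separate concepts.

However, in the case of the `BatchNormalization` layer, setting `trainable = False` on the layer means that the layer will subsequently be run in inference mode (meaning that it will use the moving mean and the moving variance to normalize the current batch, rather than using the mean and variance of the current batch).

This behavior was introduced in TensorFlow 2.0 so that `layer.trainable = False` produces the most commonly expected behavior in the convnet fine-tuning use case.

Note that:
- This behavior only occurs as of TensorFlow 2.0. In 1.*, setting `layer.trainable = False` would freeze the layer but would not switch it to inference mode.
- Setting `trainable` on a model containing other layers will recursively set the `trainable` value of all inner layers.
- If the value of the `trainable` attribute is changed after calling `compile()` on a model, the new value does not take effect for this model until `compile()` is called again.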
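The rule described in the note above can be summarized as a small truth table. The helper `uses_batch_statistics` below is hypothetical and only models the described behavior; it is not part of TensorFlow or Keras, and the `tf2_semantics` switch is an assumption made for this sketch.

```python
def uses_batch_statistics(training, trainable, tf2_semantics=True):
    """Return True if the layer would normalize with the current batch's
    mean and variance, False if it would use the moving statistics.

    Hypothetical helper that models the note above; not a TensorFlow API.
    """
    if tf2_semantics:
        # TF 2.x: a frozen layer (trainable=False) always runs in inference mode.
        return training and trainable
    # TF 1.x: only the `training` argument selects the mode.
    return training


for training in (False, True):
    for trainable in (False, True):
        tf1 = uses_batch_statistics(training, trainable, tf2_semantics=False)
        tf2 = uses_batch_statistics(training, trainable, tf2_semantics=True)
        print(f"training={training}, trainable={trainable}: "
              f"batch statistics used in TF1={tf1}, in TF2={tf2}")
```

The only combination where the two versions disagree is `training=True` with `trainable=False`, which is the fine-tuning case the note refers to.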
Thanks @vijay M. During inference, shouldn't the activations be normalized using the moving mean and variance "learned" during training?