What's wrong with batch normalization in TensorFlow?
Tags: python, tensorflow, convolution, kaggle, batch-normalization
I am fitting a CNN for the Digit Recognizer competition on Kaggle. The structure is as follows: conv5x5(filters=32)-conv5x5(filters=32)-maxpool2x2-conv3x3(filters=64)-conv3x3(filters=64)-maxpool2x2-FC(512)-dropout(keep_prob=0.25)-softmax(10). This structure reaches 99.728% accuracy on Digit Recognizer.
I want to add batch normalization to the conv layers. I add them like this:
import tensorflow as tf  # TensorFlow 1.x; tf.contrib is not available in 2.x


# Forward propagation of the whole CNN
def forward_propagation(X, keep_prob_l5, BN_is_training, conv_params,
                        convstride1_shape, convstride2_shape,
                        pool2_shape, poolstride2_shape,
                        convstride3_shape, convstride4_shape,
                        pool4_shape, poolstride4_shape, n_5, n_out):
    # Convolution kernels and biases
    W1 = conv_params['W1']
    b1 = conv_params['b1']
    W2 = conv_params['W2']
    b2 = conv_params['b2']
    W3 = conv_params['W3']
    b3 = conv_params['b3']
    W4 = conv_params['W4']
    b4 = conv_params['b4']

    # Block 1: conv -> batch norm -> ReLU
    Z1 = tf.nn.bias_add(tf.nn.conv2d(X, W1, strides=convstride1_shape, padding='SAME'), b1, data_format='NHWC')
    Z1_bachnorm = tf.contrib.layers.batch_norm(Z1, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A1 = tf.nn.relu(Z1_bachnorm)

    # Block 2: conv -> batch norm -> ReLU -> max pool
    Z2 = tf.nn.bias_add(tf.nn.conv2d(A1, W2, strides=convstride2_shape, padding='SAME'), b2, data_format='NHWC')
    Z2_bachnorm = tf.contrib.layers.batch_norm(Z2, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A2 = tf.nn.relu(Z2_bachnorm)
    # ksize reuses the stride shape, as in the original post; pool2_shape is unused
    P2 = tf.nn.max_pool(A2, ksize=poolstride2_shape, strides=poolstride2_shape, padding='SAME')

    # Block 3: conv -> batch norm -> ReLU
    Z3 = tf.nn.bias_add(tf.nn.conv2d(P2, W3, strides=convstride3_shape, padding='SAME'), b3, data_format='NHWC')
    Z3_bachnorm = tf.contrib.layers.batch_norm(Z3, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A3 = tf.nn.relu(Z3_bachnorm)

    # Block 4: conv -> batch norm -> ReLU -> max pool
    Z4 = tf.nn.bias_add(tf.nn.conv2d(A3, W4, strides=convstride4_shape, padding='SAME'), b4, data_format='NHWC')
    Z4_bachnorm = tf.contrib.layers.batch_norm(Z4, center=True, scale=True, is_training=BN_is_training, data_format='NHWC')
    A4 = tf.nn.relu(Z4_bachnorm)
    # ksize reuses the stride shape, as in the original post; pool4_shape is unused
    P4 = tf.nn.max_pool(A4, ksize=poolstride4_shape, strides=poolstride4_shape, padding='SAME')

    # Fully connected head: FC(n_5) -> dropout -> linear output layer
    P4_flatten = tf.contrib.layers.flatten(P4)
    A5 = tf.contrib.layers.fully_connected(P4_flatten, n_5, activation_fn=tf.nn.relu)
    A5_drop = tf.nn.dropout(A5, keep_prob_l5)
    Z_out = tf.contrib.layers.fully_connected(A5_drop, n_out, activation_fn=None)

    # Logits are returned transposed, with shape (n_out, batch_size)
    return tf.transpose(Z_out)
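To make the layer sizes concrete, here is a small sketch that traces the spatial shape through the stack described in the question. It is not part of the original code. It assumes 28x28 single-channel inputs (the Digit Recognizer images), a stride of 1 for every convolution and a stride of 2 for every pooling layer; the post passes the strides in as parameters and does not state their values. The helper names `same_out` and `trace_shapes` are hypothetical. With padding='SAME', the output size per spatial dimension is ceil(input / stride).

```python
import math


def same_out(size, stride):
    """Spatial output size for padding='SAME': ceil(size / stride)."""
    return math.ceil(size / stride)


def trace_shapes(height=28, width=28, conv_stride=1, pool_stride=2):
    """Return a list of (layer_name, (H, W, C)) for each conv/pool stage.

    The strides are assumptions; the filter counts come from the post.
    """
    layers = [
        ("conv1", conv_stride, 32),
        ("conv2", conv_stride, 32),
        ("pool2", pool_stride, None),  # pooling keeps the channel count
        ("conv3", conv_stride, 64),
        ("conv4", conv_stride, 64),
        ("pool4", pool_stride, None),
    ]
    h, w, c = height, width, 1  # assumed single-channel (grayscale) input
    shapes = []
    for name, stride, filters in layers:
        h, w = same_out(h, stride), same_out(w, stride)
        if filters is not None:
            c = filters
        shapes.append((name, (h, w, c)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print(name, shape)

# Number of features seen by the fully connected layer after flattening
h, w, c = shapes[-1][1]
flat_size = h * w * c
print("flattened features:", flat_size)
```

Under these assumptions the tensor fed to the 512-unit layer has 7 * 7 * 64 = 3136 features per example. Different strides would change these numbers.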
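In forward_propagation, BN_is_training is a placeholder that is True during training and False during inference.
The update ops are set up as follows: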
# Define the optimization method
# Collect the batch-norm moving-average updates so they run with every training step
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    optimizer = tf.train.AdamOptimizer(learning_rate=decayed_learning_rate).minimize(cost)
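For readers unfamiliar with what the is_training flag and the update ops control, here is a framework-free sketch of the usual batch-normalization behaviour. In training mode the layer normalizes with the statistics of the current batch; in inference mode it uses moving averages that are only refreshed when the update ops run. This is a toy model, not TensorFlow's implementation. `ToyBatchNorm` and its `run_update_ops` argument are hypothetical names, the learned scale and shift are omitted, and decay=0.9 is chosen only so the averages converge quickly in the demo.

```python
class ToyBatchNorm:
    """Toy batch norm for a single scalar feature (no learned scale/shift)."""

    def __init__(self, decay=0.9, eps=1e-3):
        self.decay = decay        # illustrative value, not a library default
        self.eps = eps
        self.moving_mean = 0.0    # moving statistics start at mean 0, variance 1
        self.moving_var = 1.0

    def forward(self, batch, is_training, run_update_ops=True):
        if is_training:
            # Training: normalize with the statistics of the current batch
            mean = sum(batch) / len(batch)
            var = sum((x - mean) ** 2 for x in batch) / len(batch)
            if run_update_ops:
                # Stand-in for the ops collected under UPDATE_OPS
                d = self.decay
                self.moving_mean = d * self.moving_mean + (1 - d) * mean
                self.moving_var = d * self.moving_var + (1 - d) * var
        else:
            # Inference: normalize with the stored moving averages
            mean, var = self.moving_mean, self.moving_var
        return [(x - mean) / (var + self.eps) ** 0.5 for x in batch]


batch = [3.0, 5.0, 7.0]  # batch mean is 5.0

tracked = ToyBatchNorm()  # update ops run on every training step
stale = ToyBatchNorm()    # update ops never run
for _ in range(100):
    tracked.forward(batch, is_training=True, run_update_ops=True)
    stale.forward(batch, is_training=True, run_update_ops=False)

train_out = tracked.forward(batch, is_training=True)
infer_tracked = tracked.forward(batch, is_training=False)
infer_stale = stale.forward(batch, is_training=False)

print("moving mean with updates:   ", tracked.moving_mean)
print("moving mean without updates:", stale.moving_mean)
print("inference with updates:     ", infer_tracked)
print("inference without updates:  ", infer_stale)
```

In this toy model, skipping the updates leaves the moving mean at its initial value, so inference-mode outputs are barely normalized. Training-mode outputs are centered either way.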
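After adding batch normalization, however, the results are really strange: the accuracy never improves, and the cost keeps increasing. Did I make a mistake in setting up batch normalization?
Thanks :D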