
Hand-crafted Xavier initializer: what values for lrelu and relu? (machine-learning, initialization, tensorflow)


As a follow-up to the answer (not the accepted one) in the linked question: does anyone have an idea which values to use for relu, and especially for leaky relu?

I am referring to this part:

# use 4 for sigmoid, 1 for tanh activation
given here:

(fan_in, fan_out) = ...
low = -4*np.sqrt(6.0/(fan_in + fan_out)) # use 4 for sigmoid, 1 for tanh activation
high = 4*np.sqrt(6.0/(fan_in + fan_out))
return tf.Variable(tf.random_uniform(shape, minval=low, maxval=high, dtype=tf.float32))
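For completeness, a self-contained version of that snippet might look like this. This is only a sketch: the function name xavier_init and the const argument are my additions, not part of the original post, and the shape [fan_in, fan_out] is assumed:

import numpy as np
import tensorflow as tf

def xavier_init(fan_in, fan_out, const=4.0):
    # const=4.0 reproduces the sigmoid setting above; use const=1.0 for tanh
    low = -const * np.sqrt(6.0 / (fan_in + fan_out))
    high = const * np.sqrt(6.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform([fan_in, fan_out],
                                         minval=low, maxval=high,
                                         dtype=tf.float32))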
According to equation 15 of He et al. 2015, the theoretical weight variance for one layer when using ReLu becomes:

n*Var[W] = 2
where n is the layer size.
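As an aside (not part of the original answer), one way to turn that variance target into uniform bounds: a weight drawn from U(-a, a) has variance a^2/3, so Var[W] = 2/n requires a = sqrt(6/n). A quick numerical check, with n = 256 as an arbitrary example:

import numpy as np

n = 256                          # example layer size
a = np.sqrt(6.0 / n)             # bound such that a**2 / 3 == 2 / n
w = np.random.uniform(-a, a, size=1000000)
print(w.var())                   # ~= 2.0 / n == 0.0078125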

If you want to use the pooled variance of both the in layer and the out layer, it becomes:

(fan_in, fan_out) = ...
low = -2*np.sqrt(1.0/(fan_in + fan_out))
high = 2*np.sqrt(1.0/(fan_in + fan_out))
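Packed into the same helper shape as the question's snippet (a sketch; the function name relu_init is hypothetical and the shape [fan_in, fan_out] is assumed):

import numpy as np
import tensorflow as tf

def relu_init(fan_in, fan_out):
    # pooled in/out variance bounds from above
    bound = 2 * np.sqrt(1.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform([fan_in, fan_out],
                                         minval=-bound, maxval=bound,
                                         dtype=tf.float32))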
If you are using tensorflow, it has a variance_scaling_initializer, where you can set the factor and mode arguments to control how you want the initialization to be.

If you use the default setting factor=2.0 for this initializer, you will get the initialization variances suggested by He et al. 2015 for ReLu activation, although you can play with the mode argument to get slightly different weight initialization variances. Only using the in layer:

tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_IN') 
would give you:

(fan_in, fan_out) = ...
low = -np.sqrt(2.0/fan_in)
high = np.sqrt(2.0/fan_in)
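In TF 1.x this initializer plugs straight into tf.get_variable; the variable name and shape below are arbitrary examples, not from the original answer. Note that variance_scaling_initializer draws from a truncated normal by default; passing uniform=True makes it sample uniformly, matching the uniform bounds shown here:

import tensorflow as tf

init = tf.contrib.layers.variance_scaling_initializer(factor=2.0,
                                                      mode='FAN_IN',
                                                      uniform=True)
w = tf.get_variable("w", shape=[256, 128], initializer=init)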
Using both the in and out layers:

tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_AVG')
would give you:

(fan_in, fan_out) = ...
low = -np.sqrt(4.0/(fan_in+fan_out)) = -2.0*np.sqrt(1.0/(fan_in+fan_out))
high = np.sqrt(4.0/(fan_in+fan_out)) = 2.0*np.sqrt(1.0/(fan_in+fan_out))
Only using the out layer:

tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_OUT')

would give you:

(fan_in, fan_out) = ...
low = -np.sqrt(2.0/fan_out)
high = np.sqrt(2.0/fan_out)
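To make the three modes concrete, here is a quick numeric comparison of the resulting bounds for one arbitrary pair of fan sizes (my example, not from the original answer):

import numpy as np

fan_in, fan_out = 256, 128
print(np.sqrt(2.0 / fan_in))                     # FAN_IN  bound: ~0.088
print(2.0 * np.sqrt(1.0 / (fan_in + fan_out)))   # FAN_AVG bound: ~0.102
print(np.sqrt(2.0 / fan_out))                    # FAN_OUT bound:  0.125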