
Hand-crafted Xavier initializer: what values for lrelu and relu? (machine-learning, initialization, tensorflow)


As a follow-up to the answer (not the accepted one) in the linked question: does anyone have an idea which values to use for relu, and especially for leaky relu?

I am referring to this part:

# use 4 for sigmoid, 1 for tanh activation
given here:

(fan_in, fan_out) = ...
low = -4*np.sqrt(6.0/(fan_in + fan_out)) # use 4 for sigmoid, 1 for tanh activation
high = 4*np.sqrt(6.0/(fan_in + fan_out))
return tf.Variable(tf.random_uniform(shape, minval=low, maxval=high, dtype=tf.float32))
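For completeness, a self-contained version of that snippet might look like this. This is only a sketch: the function name xavier_init and the const argument are my additions, not part of the original post, and the shape [fan_in, fan_out] is assumed:

import numpy as np
import tensorflow as tf

def xavier_init(fan_in, fan_out, const=4.0):
    # const=4.0 reproduces the sigmoid setting above; use const=1.0 for tanh
    low = -const * np.sqrt(6.0 / (fan_in + fan_out))
    high = const * np.sqrt(6.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform([fan_in, fan_out],
                                         minval=low, maxval=high,
                                         dtype=tf.float32))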
According to equation 15 of He et al. 2015, the theoretical weight variance for one layer when using ReLu becomes:

n*Var[W] = 2
where n is the layer size.
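As an aside (not part of the original answer), one way to turn that variance target into uniform bounds: a weight drawn from U(-a, a) has variance a^2/3, so Var[W] = 2/n requires a = sqrt(6/n). A quick numerical check, with n = 256 as an arbitrary example:

import numpy as np

n = 256                          # example layer size
a = np.sqrt(6.0 / n)             # bound such that a**2 / 3 == 2 / n
w = np.random.uniform(-a, a, size=1000000)
print(w.var())                   # ~= 2.0 / n == 0.0078125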

If you want to use the pooled variance of both the in layer and the out layer, it becomes:

(fan_in, fan_out) = ...
low = -2*np.sqrt(1.0/(fan_in + fan_out))
high = 2*np.sqrt(1.0/(fan_in + fan_out))
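Packed into the same helper shape as the question's snippet (a sketch; the function name relu_init is hypothetical and the shape [fan_in, fan_out] is assumed):

import numpy as np
import tensorflow as tf

def relu_init(fan_in, fan_out):
    # pooled in/out variance bounds from above
    bound = 2 * np.sqrt(1.0 / (fan_in + fan_out))
    return tf.Variable(tf.random_uniform([fan_in, fan_out],
                                         minval=-bound, maxval=bound,
                                         dtype=tf.float32))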
If you are using tensorflow, it has a variance_scaling_initializer, where you can set the factor and mode arguments to control how you want the initialization to be.

If you use the default setting factor=2.0 for this initializer, you will get the initialization variances suggested by He et al. 2015 for ReLu activation, although you can play with the mode argument to get slightly different weight initialization variances. Only using the in layer:

tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_IN') 
would give you:

(fan_in, fan_out) = ...
low = -np.sqrt(2.0/fan_in)
high = np.sqrt(2.0/fan_in)
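In TF 1.x this initializer plugs straight into tf.get_variable; the variable name and shape below are arbitrary examples, not from the original answer. Note that variance_scaling_initializer draws from a truncated normal by default; passing uniform=True makes it sample uniformly, matching the uniform bounds shown here:

import tensorflow as tf

init = tf.contrib.layers.variance_scaling_initializer(factor=2.0,
                                                      mode='FAN_IN',
                                                      uniform=True)
w = tf.get_variable("w", shape=[256, 128], initializer=init)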
Using both the in and out layers:

tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_AVG')
would give you:

(fan_in, fan_out) = ...
low = -np.sqrt(4.0/(fan_in+fan_out)) = -2.0*np.sqrt(1.0/(fan_in+fan_out))
high = np.sqrt(4.0/(fan_in+fan_out)) = 2.0*np.sqrt(1.0/(fan_in+fan_out))
Only using the out layer:

tf.contrib.layers.variance_scaling_initializer(factor=2.0, mode='FAN_OUT')

would give you:

(fan_in, fan_out) = ...
low = -np.sqrt(2.0/fan_out)
high = np.sqrt(2.0/fan_out)
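To make the three modes concrete, here is a quick numeric comparison of the resulting bounds for one arbitrary pair of fan sizes (my example, not from the original answer):

import numpy as np

fan_in, fan_out = 256, 128
print(np.sqrt(2.0 / fan_in))                     # FAN_IN  bound: ~0.088
print(2.0 * np.sqrt(1.0 / (fan_in + fan_out)))   # FAN_AVG bound: ~0.102
print(np.sqrt(2.0 / fan_out))                    # FAN_OUT bound:  0.125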