How to implement ReLU instead of the Sigmoid function in Python
The code takes in three numbers [(value_1), (value_2), (value_1 - value_2)] and should return "0" if the difference between the first and second value is negative, and "1" if the difference is positive. So far it actually works quite well. Here is the output:
ERROR: 0.497132186092
ERROR: 0.105081486632
ERROR: 0.102115299177
ERROR: 0.100813655802
ERROR: 0.100042420179
ERROR: 0.0995185781466
Test Output
[0.0074706006801269686, 0.66687458928464094, 0.66687458928463983, 0.66686236694464551, 0.98341439176739631]
Output
[0.66687459245609326, 0.00083944690766060215, 0.000839464712854584, 0.0074706634783305243, 0.0074706634765733968, 0.007480987498372226, 0.99646513183073093, 0.99647100131874755, 0.99646513180692531, 0.0008394457238723831023, 0.99646513180692531, 0.98166111861, 0.66743972727272729]
As you can see, the error with alpha = 0.0251 (for gradient descent — found by trial and error) is only 9.95%.
Since writing this program I have learned that leaky ReLU is a better alternative to the sigmoid function because it optimizes and learns faster. I would like to implement the leaky ReLU function in this program using numpy, but I am not sure where to start, and in particular what its derivative is.
How can I implement leaky ReLU in this neural network?

I would like to add here that there are actually many ReLU-like activation functions that can be used instead of the standard one:
- You have already mentioned leaky ReLU (parameterized by alpha).
- PReLU (parametric ReLU). The formula is the same as leaky ReLU, but the coefficient alpha is allowed to be learned. See also
- ELU (exponential linear unit), which tries to make the mean activation close to zero, which speeds up learning:
- SELU (scaled exponential linear unit), published recently. It is an extension of ELU with a specific choice of parameters that has an additional normalizing effect and helps learning converge faster.
Here are all the activations and their derivatives. ReLU optimizes and learns faster than sigmoid only under certain conditions (it avoids some of sigmoid's drawbacks, but has downsides of its own, such as the so-called "dying ReLU" problem; the picture is quite a bit more complicated). Moreover, if you need your network to return values between 0 and 1, you will need sigmoid or a close alternative for the output anyway, because ReLU is unbounded. If you want to design neural networks yourself, I would start here:

Thanks for the information. I do want my network to return values between 0 and 1, because I need it to return classical probabilities. I have a question about the ReLU function: why would anyone need a function whose output is not between 0 and 1, and does that mean the sigmoid and ReLU functions are not interchangeable? Also, I may have accidentally flagged your comment.
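To make the derivative question concrete, here is a minimal numpy sketch of leaky ReLU. The negative-side slope (called alpha below; 0.01 is a common default, assumed here) is a hyperparameter of the activation, distinct from the learning rate alpha used in the training code:

```python
import numpy as np

def leaky_relu(x, deriv=False, alpha=0.01):
    # forward:    x       where x > 0, alpha * x elsewhere
    # derivative: 1       where x > 0, alpha     elsewhere
    if deriv:
        return np.where(x > 0, 1.0, alpha)
    return np.where(x > 0, x, alpha * x)
```

Because a leaky ReLU output has the same sign as its input (for alpha > 0), the derivative can also be computed from the activation value itself, which matches how nonlinear(l2, True) is called in the code below.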
import numpy as np

alpha = 0.0251  # learning rate - as close to the true alpha as possible

def nonlinear(x, deriv=False):
    # sigmoid; for the derivative, x is expected to be the sigmoid output
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# seed
np.random.seed(1)

# testing sample
test_x = np.array([[251, 497, -246],
                   [299, 249, 50],
                   [194, 180, 14],
                   [140, 148, -8],
                   [210, 140, 70]])

# Input array - this input will be taken directly from a Pong game
X = np.array([[198, 200, -2],
              [90, 280, -190],
              [84, 256, -172],
              [140, 240, -100],
              [114, 216, -102],
              [72, 95, -23],
              [99, 31, 68],
              [144, 20, 124],
              [640, 216, 424],
              [32, 464, -432],
              [176, 64, 112],
              [754, 506, 248],
              [107, 104, 3],
              [116, 101, 15]])

# Output array - if ball_pos - paddle > 0 move up, else move down
Y = np.array([[0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1]]).T

syn0 = 2 * np.random.random((3, 14)) - 1
syn1 = 2 * np.random.random((14, 14)) - 1

for j in range(60000):
    # forward propagation
    l0 = X
    l1 = nonlinear(np.dot(l0, syn0))
    l2 = nonlinear(np.dot(l1, syn1))

    # how much did we miss?
    l2_error = Y - l2

    # multiply how much we missed by the slope of the sigmoid at the values in l2
    l2_delta = l2_error * nonlinear(l2, True)

    # how much did l1 contribute to the l2 error (according to the weights)?
    l1_error = l2_delta.dot(syn1.T)

    # in what direction is the target l1?
    # Sure?
    l1_delta = l1_error * nonlinear(l1, True)

    # update weights
    syn1 += alpha * l1.T.dot(l2_delta)
    syn0 += alpha * l0.T.dot(l1_delta)

    # display error
    if j % 10000 == 0:
        print("ERROR: " + str(np.mean(np.abs(l2_error))))

# Testing forward propagation
l0_test = test_x
l1_test = nonlinear(np.dot(l0_test, syn0))
l2_test = nonlinear(np.dot(l1_test, syn1))

# Dress up the array (make it look nice)
l2_test_output = []
for x in range(len(l2_test)):
    l2_test_output.append(l2_test[x][0])
print("Test Output")
print(l2_test_output)

# Put all the l2 data in a readable form: just the first probabilities
l2_output = []
for x in range(len(l2)):
    l2_output.append(l2[x][0])
print("Output")
print(l2_output)