Python: How do I avoid dead ReLU and NaN output values?

I was recently given an assignment to write a dense neural network from scratch in Python. We are supposed to solve some regression problems using the Sigmoid, Tanh, and ReLU activation functions. However, although my network does work for classification problems, I run into trouble whenever I use it for regression.

First, if I use this dataset: (I need to test my network with this data), then whenever I train and then predict with ReLU as the activation function, all outputs are 0. I talked to my professor, and he suggested I try Leaky ReLU, but when I do, I get NaN values. I have also trained with a very small learning rate, such as 1*10^(-9), just as a test. In that case I don't get NaN values, but the error is always very high regardless. My NN has one hidden layer, and I tried several numbers of hidden nodes to see whether anything improved, but nothing did.

Here is how I define my activation function (leaky ReLU):

Here is my neural network class definition:
import numpy as np

# inputSize, HiddenNodes, outputSize, learning1, learning2, activation and
# activation_derivative are used as module-level names defined elsewhere.
class NeuralNetwork():
    def __init__(self, x, y, x_test, y_test):
        self.input = x  # input to NN
        self.weights1 = np.random.uniform(-0.5, 0.5, (inputSize, HiddenNodes))  # input to hidden weights
        self.weights2 = np.random.uniform(-0.5, 0.5, (HiddenNodes, outputSize))  # hidden to output weights
        self.y = y  # real outputs of the training set
        self.output = np.zeros(self.y.shape)  # output of NN
        self.test_input = x_test
        self.outputTest = np.zeros(y_test.shape)

    def feedforward(self):  # simple feedforward code
        self.layer1 = activation(np.dot(self.input, self.weights1))
        self.output = activation(np.dot(self.layer1, self.weights2))
        return self.output

    def backprop(self):
        # application of the chain rule to find derivative of the loss function with respect to weights2 and weights1
        slopeOut = activation_derivative(self.output)
        slopeIn = activation_derivative(self.layer1)
        ErrorOut = 2 * (self.y - self.output) * slopeOut
        ErrorHiddenLayer = np.dot(ErrorOut, self.weights2.T)
        # self.lambd (regularization strength) is not set in the code shown
        d_weights2 = np.dot(self.layer1.T, ErrorOut) + self.lambd * self.weights2
        d_weights1 = np.dot(self.input.T, ErrorHiddenLayer * slopeIn) + self.lambd * self.weights1
        # update the weights with the derivative (slope) of the loss function
        sum1 = np.multiply(learning1, d_weights1)
        sum2 = np.multiply(learning2, d_weights2)
        self.weights1 += sum1
        self.weights2 += sum2

    def train(self, X, y):  # function to train nn
        self.output = self.feedforward()
        self.backprop()

    def test(self, X2, Y2):  # function to predict a given value
        self.layer1test = activation(np.dot(X2, self.weights1))
        self.outputTest = activation(np.dot(self.layer1test, self.weights2))
        return self.outputTest
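
The leaky ReLU definition mentioned in the question is not shown here. The following is only a sketch of what such a pair of functions could look like, not the asker's actual code. The slope `ALPHA = 0.01` is an assumed value, and the derivative is written to take the already-activated value, since `backprop()` passes `self.output` and `self.layer1` rather than the pre-activation sums.

```python
import numpy as np

ALPHA = 0.01  # assumed negative-side slope; not taken from the question


def activation(z):
    """Leaky ReLU: returns z where z > 0 and ALPHA * z elsewhere."""
    return np.where(z > 0, z, ALPHA * z)


def activation_derivative(a):
    """Slope of leaky ReLU, computed from the activated value a.

    With ALPHA > 0 the activated value has the same sign as the
    pre-activation input, so the slope can be recovered from it.
    """
    return np.where(a > 0, 1.0, ALPHA)
```

Recovering the slope from the activated value only works because `ALPHA` is positive. With a plain ReLU the output is 0 for every negative input, so the slope there is 0 and the corresponding weights receive no update.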
What should I do to use the ReLU activation function correctly?
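
A few things in the class above may be worth checking, offered as possibilities rather than a confirmed diagnosis. The output layer passes through the activation, so a ReLU output can never go below 0. The gradients are summed over all samples rather than averaged, so unscaled data can produce very large updates. The `self.lambd` term is added in the same direction as the update, which grows the weights instead of shrinking them. The sketch below illustrates one alternative setup under those assumptions: standardized inputs and targets, a leaky ReLU hidden layer, a linear output, and a gradient averaged over the batch. All function names, the synthetic data, and the hyperparameters are hypothetical.

```python
import numpy as np


def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)


def leaky_relu_slope(z, alpha=0.01):
    return np.where(z > 0, 1.0, alpha)


def standardize(a):
    """Scale each column to zero mean and unit variance."""
    mean = a.mean(axis=0)
    std = a.std(axis=0) + 1e-8  # guard against constant columns
    return (a - mean) / std, mean, std


def train_regressor(x, y, hidden=8, lr=0.05, epochs=500, seed=0):
    """One hidden layer, linear output, mean squared error loss."""
    rng = np.random.default_rng(seed)
    w1 = rng.uniform(-0.5, 0.5, (x.shape[1], hidden))
    w2 = rng.uniform(-0.5, 0.5, (hidden, y.shape[1]))
    losses = []
    for _ in range(epochs):
        z1 = x @ w1
        a1 = leaky_relu(z1)
        out = a1 @ w2  # no activation on the regression output
        err = out - y
        losses.append(float(np.mean(err ** 2)))
        # Gradient of the mean squared error, averaged over all entries.
        grad_out = 2.0 * err / err.size
        grad_w2 = a1.T @ grad_out
        grad_w1 = x.T @ ((grad_out @ w2.T) * leaky_relu_slope(z1))
        # Plain gradient descent: step against the gradient.
        w1 -= lr * grad_w1
        w2 -= lr * grad_w2
    return w1, w2, losses


# Synthetic data standing in for the real dataset (assumption).
rng = np.random.default_rng(1)
x_raw = rng.uniform(0, 1000, (200, 3))
y_raw = x_raw @ np.array([[2.0], [-1.0], [0.5]]) + 100.0

x_s, _, _ = standardize(x_raw)
y_s, y_mean, y_std = standardize(y_raw)
w1, w2, losses = train_regressor(x_s, y_s)

# Predictions are mapped back to the original target scale.
pred = (leaky_relu(x_s @ w1) @ w2) * y_std + y_mean
```

Because the targets are standardized before training, predictions have to be multiplied by `y_std` and shifted by `y_mean` to be comparable with the raw data.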