
Python ReLU activation function not working

Tags: python, numpy, neural-network, relu

My first neural network used the sigmoid activation function and worked well. Now I want to switch to the more advanced activation function ReLU. But with ReLU my NN doesn't work at all: I get a 90% error rate, versus 4% with sigmoid. I can't find the bug in my code. Please help.

import numpy as np

class NeuralNetwork:
    def __init__(self, input_nodes, hidden_nodes, output_nodes, learning_rate = 0.1):
        self.input_nodes = input_nodes
        self.hidden_nodes = hidden_nodes
        self.output_nodes = output_nodes
        self.learning_rate = learning_rate

        self.weights_ih = np.random.normal(0.0, pow(input_nodes, -0.5), (hidden_nodes, input_nodes))
        self.weights_ho = np.random.normal(0.0, pow(hidden_nodes, -0.5), (output_nodes, hidden_nodes))
        self.bias_h = np.random.normal(0.0, pow(1, -0.5), (hidden_nodes, 1))
        self.bias_o = np.random.normal(0.0, pow(1, -0.5), (output_nodes, 1))

    def activation_function(self, x):
        # ReLU: elementwise max(0, x)
        return x * (x > 0)

    def activation_function_d(self, x):
        # ReLU derivative: 1 where x >= 0, else 0
        return 1 * (x >= 0)

    def train(self, inputs_list, targets_list):
        inputs = np.array(inputs_list, ndmin=2).T
        targets = np.array(targets_list, ndmin=2).T

        # Feedforward
        hidden_inputs = np.dot(self.weights_ih, inputs) + self.bias_h
        hidden = self.activation_function(hidden_inputs)
        output_inputs = np.dot(self.weights_ho, hidden) + self.bias_o
        outputs = self.activation_function(output_inputs)

        # Calculate errors
        output_errors = targets - outputs
        hidden_errors = np.dot(self.weights_ho.T, output_errors)

        # Calculate gradients
        output_gradient = output_errors * self.activation_function_d(output_inputs) * self.learning_rate
        hidden_gradient = hidden_errors * self.activation_function_d(hidden_inputs) * self.learning_rate

        # Calculate deltas
        output_deltas = np.dot(output_gradient, hidden.T)
        hidden_deltas = np.dot(hidden_gradient, inputs.T)

        # Adjust weights and biases by deltas and gradients
        self.weights_ho += output_deltas
        self.weights_ih += hidden_deltas
        self.bias_o     += output_gradient
        self.bias_h     += hidden_gradient

    def predict(self, inputs_list):
        inputs = np.array(inputs_list, ndmin=2).T
        hidden = self.activation_function(np.dot(self.weights_ih, inputs) + self.bias_h)
        outputs = self.activation_function(np.dot(self.weights_ho, hidden) + self.bias_o)
        return outputs.flatten().tolist()
Training code:

with open('mnist_train.csv') as train_file:
    for line in train_file:  # one sample per CSV line: label first, then 784 pixels
        data = [int(value) for value in line.split(',')]
        inputs = data[1:]
        targets = [1 if i == data[0] else 0 for i in range(10)]  # one-hot label
        nn.train(inputs, targets)
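
The post never shows how the nn instance is constructed; for MNIST it would presumably be created along these lines, where 784 inputs (28×28 pixels) and 10 output classes are fixed by the data, but the hidden layer size is an assumption:

nn = NeuralNetwork(input_nodes=784, hidden_nodes=200, output_nodes=10, learning_rate=0.1)  # hidden size is a guess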

In the binary (one-vs-all) case, the last layer should always use sigmoid, no matter what you are trying to do.

The sigmoid function estimates the probability that an example belongs to a given class; the predicted class for an example is the one with the highest probability.
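
For example, a minimal sketch of reading off the predicted digit from the ten sigmoid outputs (assuming the nn instance and inputs from the training code above):

import numpy as np

outputs = nn.predict(inputs)               # ten per-class probability estimates
predicted_digit = int(np.argmax(outputs))  # the class with the highest probability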

In short, change this:

def predict(self, inputs_list):
    inputs = np.array(inputs_list, ndmin=2).T
    hidden = self.activation_function(np.dot(self.weights_ih, inputs) + self.bias_h)
    outputs = self.activation_function(np.dot(self.weights_ho, hidden) + self.bias_o)
    return outputs.flatten().tolist()
to this:

def predict(self, inputs_list):
    inputs = np.array(inputs_list, ndmin=2).T
    hidden = self.activation_function(np.dot(self.weights_ih, inputs) + self.bias_h)
    outputs = sigmoid(np.dot(self.weights_ho, hidden) + self.bias_o)  # use a sigmoid here (defined below)
    return outputs.flatten().tolist()
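
The answer does not define the sigmoid helper it calls here (or the sigmoid_d used in the gradient fix below); a minimal sketch:

import numpy as np

def sigmoid(x):
    # logistic function: squashes values into (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_d(x):
    # derivative of the sigmoid, expressed via the sigmoid itself
    s = sigmoid(x)
    return s * (1.0 - s)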
And in the training step, change:

    # Feedforward
    hidden_inputs = np.dot(self.weights_ih, inputs) + self.bias_h
    hidden = self.activation_function(hidden_inputs)
    output_inputs = np.dot(self.weights_ho, hidden) + self.bias_o
    outputs = self.activation_function(output_inputs)
to:

    # Feedforward
    hidden_inputs = np.dot(self.weights_ih, inputs) + self.bias_h
    hidden = self.activation_function(hidden_inputs)
    output_inputs = np.dot(self.weights_ho, hidden) + self.bias_o
    outputs = sigmoid(output_inputs)

and change the gradient calculation from:

    # Calculate gradients
    output_gradient = output_errors * self.activation_function_d(output_inputs) * self.learning_rate
    hidden_gradient = hidden_errors * self.activation_function_d(hidden_inputs) * self.learning_rate

to:

    # Calculate gradients
    output_gradient = output_errors * sigmoid_d(output_inputs) * self.learning_rate
    hidden_gradient = hidden_errors * self.activation_function_d(hidden_inputs) * self.learning_rate

Comments:

Why use the sigmoid derivative on the output layer but the ReLU derivative on the hidden layer when calculating the gradients? If I use sigmoid, I should normalize the targets to [0, 1], right? And if I use ReLU, should I normalize the inputs to some range?

Try the same range; in my experience it works well, although I don't think ReLU requires normalization to a specific range.

I replaced ReLU with sigmoid in the output layer but got a 30% error rate, and after a second training pass it rose to 50%. Switching back to sigmoid everywhere gives a 6% error rate. Could you take a look at my code? Either way, I'll go with your suggestion.
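
For completeness, a sketch of the scaling discussed in the comments above; squashing raw pixels into a small positive range and using soft targets of 0.01/0.99 is a common MNIST recipe rather than something specified in the post:

def preprocess(data):
    # data[0] is the label, data[1:] are the 784 raw pixel values in 0-255
    inputs = [px / 255.0 * 0.99 + 0.01 for px in data[1:]]         # scale pixels into (0.01, 1.0]
    targets = [0.99 if i == data[0] else 0.01 for i in range(10)]  # soft one-hot targets
    return inputs, targets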