Approximating the sine function with a simple neural network in Python


I'm trying to write a very simple neural network with three layers of neurons (3, 3, 1) and rectified linear activation functions to approximate the sine function in Python. My code runs, but the optimization I came up with doesn't seem to work as intended. I'm fairly new to both Python and neural networks. Can anyone give me some hints on how to improve my optimization, or point out other mistakes?

import numpy as np
import math as math
import matplotlib.pyplot as plt

np.random.seed(123)
inputs = np.arange(0, 6.2, 0.1)  # training points in [0, 6.2)

# Uniform random initialization in [-1, 1)
weights1 = np.random.rand(3, 1) * 2 - 1
weights2 = np.random.rand(3, 3) * 2 - 1
weights3 = np.random.rand(1, 3) * 2 - 1
biases1 = np.random.rand(3, 1) * 2 - 1
biases2 = np.random.rand(3, 1) * 2 - 1
bias3 = np.random.rand(1, 1) * 2 - 1

def layer1(input1, weights1, biases1):
    # ReLU(W @ x + b); np.maximum(0, ...) broadcasts over the (3, 1) result
    return np.maximum(0, np.dot(weights1, input1) + biases1)

def layer2(input2, weights2, biases2):
    return np.maximum(0, np.dot(weights2, input2) + biases2)

def layer3(input3, weights3, bias3):
    return np.maximum(0, np.dot(weights3, input3) + bias3)
The optimization I came up with simply moves the weights and biases in whichever direction reduces the mean squared error over the input points:

delta = 0.0001
for i in range(1000):
    weights1_plus = weights1 + delta
    weights2_plus = weights2 + delta
    weights3_plus = weights3 + delta
    biases1_plus = biases1 + delta
    biases2_plus = biases2 + delta
    bias3_plus = bias3 + delta
    weights1_minus = weights1 - delta
    weights2_minus = weights2 - delta
    weights3_minus = weights3 - delta
    biases1_minus = biases1 - delta
    biases2_minus = biases2 - delta
    bias3_minus = bias3 - delta
    
    mse_plus = 0
    mse_minus = 0
    for x in inputs :
        mse_plus += (math.sin(x)-layer3(layer2(layer1(x, weights1_plus, biases1), weights2, biases2), weights3, bias3))**2
        mse_minus += (math.sin(x)-layer3(layer2(layer1(x, weights1_minus, biases1), weights2, biases2), weights3, bias3))**2
    if mse_plus>mse_minus :
        weights1 = weights1 - 0.005
    else :
        weights1 = weights1 + 0.005
        
    mse_plus = 0
    mse_minus = 0
    for x in inputs :
        mse_plus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1_plus), weights2, biases2), weights3, bias3))**2
        mse_minus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1_minus), weights2, biases2), weights3, bias3))**2
    if mse_plus>mse_minus :
        biases1 = biases1 - 0.005
    else :
        biases1 = biases1 + 0.005
        
    mse_plus = 0
    mse_minus = 0
    for x in inputs :
        mse_plus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2_plus, biases2), weights3, bias3))**2
        mse_minus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2_minus, biases2), weights3, bias3))**2
    if mse_plus>mse_minus :
        weights2 = weights2 - 0.005
    else :
        weights2 = weights2 + 0.005
        
    mse_plus = 0
    mse_minus = 0
    for x in inputs :
        mse_plus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2, biases2_plus), weights3, bias3))**2
        mse_minus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2, biases2_minus), weights3, bias3))**2
    if mse_plus>mse_minus :
        biases2 = biases2 - 0.005
    else :
        biases2 = biases2 + 0.005
        
    mse_plus = 0
    mse_minus = 0
    for x in inputs :
        mse_plus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2, biases2), weights3_plus, bias3))**2
        mse_minus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2, biases2), weights3_minus, bias3))**2
    if mse_plus>mse_minus :
        weights3 = weights3 - 0.005
    else :
        weights3 = weights3 + 0.005
        
    mse_plus = 0
    mse_minus = 0
    for x in inputs :
        mse_plus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2, biases2), weights3, bias3_plus))**2
        mse_minus += (math.sin(x)-layer3(layer2(layer1(x, weights1, biases1), weights2, biases2), weights3, bias3_minus))**2
    if mse_plus>mse_minus :
        bias3 = bias3 - 0.005
    else :
        bias3 = bias3 + 0.005
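(Aside: the scheme above shifts every entry of a matrix by the same ±0.005, so each matrix can only move along a single direction in parameter space. A per-element central-difference gradient estimates a separate slope for every entry instead. A sketch of that idea, with `numerical_grad` as an illustrative helper name, not something from the original post:)

```python
import numpy as np

def numerical_grad(loss_fn, params, eps=1e-4):
    """Central-difference gradient of loss_fn with respect to each entry
    of each array in params (perturbing one entry at a time)."""
    grads = []
    for p in params:
        g = np.zeros_like(p)
        for idx in np.ndindex(p.shape):
            old = p[idx]
            p[idx] = old + eps
            plus = loss_fn()
            p[idx] = old - eps
            minus = loss_fn()
            p[idx] = old                       # restore the original value
            g[idx] = (plus - minus) / (2 * eps)
        grads.append(g)
    return grads

# Example: the gradient of sum(w**2) is 2*w.
w = np.array([[1.0, -2.0], [0.5, 3.0]])
(gw,) = numerical_grad(lambda: float(np.sum(w ** 2)), [w])
```

Each parameter could then be updated with `p -= learning_rate * g` rather than a fixed ±0.005 step; this is slow (two loss evaluations per parameter) but behaves like true gradient descent.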
But when I plot the resulting function, it looks really bad:

test_inputs = np.arange(0, 6.2, 0.01)
test_outputs = []
for x in test_inputs :
    test_outputs.append(layer3(layer2(layer1(x,weights1, biases1),weights2, biases2),weights3, bias3))
test_outputs = [y[0][0] for y in test_outputs]
plt.plot(test_inputs, test_outputs)

TL;DR: you need to implement backpropagation. Right now, the code shifts all the weights of a layer at once. Instead, it should use the prediction error to update the last layer's weights and compute that layer's deltas, then use those deltas to update the previous layer's weights, and so on.

Thanks for the tip! I had read about backpropagation before, but I didn't really understand why updating all the weights at once wouldn't get anywhere.
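To make the answer concrete, here is a minimal backpropagation sketch for the same 1 → 3 → 3 → 1 architecture. Several details are assumptions, not from the original post: the output layer is made linear (a ReLU output can never produce sine's negative values), gradients are clipped for stability, and the learning rate and epoch count are illustrative.

```python
import numpy as np

np.random.seed(123)
inputs = np.arange(0, 6.2, 0.1)

# Same shapes as the question: 1 -> 3 -> 3 -> 1, uniform init in [-1, 1)
W1 = np.random.rand(3, 1) * 2 - 1; b1 = np.random.rand(3, 1) * 2 - 1
W2 = np.random.rand(3, 3) * 2 - 1; b2 = np.random.rand(3, 1) * 2 - 1
W3 = np.random.rand(1, 3) * 2 - 1; b3 = np.random.rand(1, 1) * 2 - 1

lr = 0.002
for epoch in range(1000):
    for x in inputs:
        # Forward pass, keeping pre-activations for the ReLU derivative
        z1 = W1 * x + b1;  a1 = np.maximum(0, z1)    # (3, 1)
        z2 = W2 @ a1 + b2; a2 = np.maximum(0, z2)    # (3, 1)
        y = W3 @ a2 + b3                             # (1, 1), linear output

        # Backward pass: output error first, then deltas layer by layer
        dy = 2 * (y - np.sin(x))          # d(squared error)/dy
        dW3 = dy @ a2.T;  db3 = dy
        dz2 = (W3.T @ dy) * (z2 > 0)      # ReLU passes gradient only where z > 0
        dW2 = dz2 @ a1.T; db2 = dz2
        dz1 = (W2.T @ dz2) * (z1 > 0)
        dW1 = dz1 * x;    db1 = dz1       # the input is a scalar

        for p, g in ((W1, dW1), (b1, db1), (W2, dW2),
                     (b2, db2), (W3, dW3), (b3, db3)):
            p -= lr * np.clip(g, -1, 1)   # clip to keep single-sample SGD stable

mse = np.mean([(np.sin(x) - (W3 @ np.maximum(0, W2 @ np.maximum(0, W1 * x + b1) + b2) + b3)) ** 2
               for x in inputs])
```

Even with correct gradients, three hidden units give only a coarse piecewise-linear fit over a full sine period; widening the hidden layers noticeably improves the plot.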