Incorrect implementation of backpropagation in Python


I'm trying to write a neural network from scratch in Python, but I've run into problems implementing backpropagation. I think I understand most of the math behind it, but I must be missing something, because I can't get it to work. I've also tried different pieces of code I found on the internet, but none of them seem to work either. I'd be grateful for any advice or a pointer in the right direction.

Here is my code:

 def backpropagation(self,x,y):
    # Place x in input
    self.layers[0].input(x)
    # feed forward 
    y_hat = self.feed_forward()
    for i in reversed(range(len(self.layers))):
        layer = self.layers[i]
        if layer == self.layers[-1]:
            error = y_hat - y
            layer.input_delta = error * self.activation_derivative(y_hat,layer.activation())
        else:
            next_layer = self.layers[i + 1]
            error = np.dot(next_layer.weight(), next_layer.delta())
            layer.input_delta = error * self.activation_derivative(layer.output(),layer.activation())

    for i in range(len(self.layers)):
        layer = self.layers[i]
        layer.input_weight += layer.delta() * self._alpha
And this is the error I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
----> 1 Model.compile(train_data=(X_train,y_train_2),lr = 1 ,validation=(X_test,y_test_2),epoch = 20,batch = 10)

 in compile(self, train_data, lr, validation, epoch, batch)
     40 
     41                 for x, y in zip(X_train,y_train):
---> 42                     self.backpropagation(x,y)
     43 
     44                 True_result = 0

 in backpropagation(self, x, y)
     28             else:
     29                 next_layer = self.layers[i + 1]
---> 30                 error = np.dot(layer.weight(), next_layer.delta())
     31                 layer.input_delta = error * self.activation_derivative(layer.output(),layer.activation())
     32 

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (4,4) and (2,2) not aligned: 4 (dim 1) != 2 (dim 0)
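
For what it's worth, this is just numpy's matrix-multiplication shape rule: np.dot on 2-D arrays requires the inner dimensions to agree, so a (4, 4) array cannot be multiplied by a (2, 2) one. A standalone snippet (the names are stand-ins, not the actual layer objects) reproduces the same error:

    import numpy as np

    w = np.ones((4, 4))   # stand-in for layer.weight()
    d = np.ones((2, 2))   # stand-in for next_layer.delta()

    # Raises ValueError: shapes (4,4) and (2,2) not aligned: 4 (dim 1) != 2 (dim 0)
    np.dot(w, d)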
The activation functions:

 def activation(self,x,activation):
        if(activation == "Relu"):
            return x * (x > 0)
        elif(activation == "Softmax"):
            e_x = np.exp(x - np.max(x)) 
        return e_x / e_x.sum(axis=0) 

 def activation_derivative(self,x,activation):
        if(activation == "Relu"):
            x[x<=0] = 0
            x[x>0] = 1
            return   x
        elif(activation == "Softmax"):
            deriv = x.reshape(-1,1)
            return np.diagflat(deriv) - np.dot(deriv, deriv.T) 
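
One detail worth checking in these two functions as written: the Relu branch returns an elementwise gradient with the same shape as x (and mutates x in place), while the Softmax branch returns the full (n, n) Jacobian, so the two branches produce differently shaped results. A standalone sketch, with the methods copied out as plain functions, illustrates this:

    import numpy as np

    def activation_derivative(x, activation):
        if activation == "Relu":
            x[x <= 0] = 0          # note: modifies x in place
            x[x > 0] = 1
            return x
        elif activation == "Softmax":
            deriv = x.reshape(-1, 1)
            return np.diagflat(deriv) - np.dot(deriv, deriv.T)

    v = np.array([1.0, -2.0, 3.0])
    print(activation_derivative(v.copy(), "Relu").shape)     # (3,)
    # for the shape check it doesn't matter that v isn't a real softmax output
    print(activation_derivative(v.copy(), "Softmax").shape)  # (3, 3)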

layer.weight() and next_layer.delta() do not have the same size. You are probably messing up the computation of one of these two after the "Relu" activation. Check the backpropagation formulas against your implementation.

I checked the activation functions again, and I don't think the problem is there :/
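
For reference, here is a minimal numpy sketch of the textbook delta recursion, using plain arrays instead of the Layer class above and hypothetical layer sizes (4 inputs, 4 hidden units, 2 outputs). The key point is that the hidden delta multiplies by the transpose of the next layer's weight matrix, which is what keeps the shapes aligned:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sizes: 4 inputs -> 4 hidden units -> 2 outputs
    W1 = rng.standard_normal((4, 4))   # maps input (4,) to hidden (4,)
    W2 = rng.standard_normal((2, 4))   # maps hidden (4,) to output (2,)

    x = rng.standard_normal(4)
    y = np.array([0.0, 1.0])

    # Forward pass with a Relu hidden activation and a linear output
    z1 = W1 @ x
    a1 = z1 * (z1 > 0)
    z2 = W2 @ a1
    y_hat = z2

    # Output delta
    delta2 = y_hat - y                  # shape (2,)

    # Hidden delta: W2.T is (4, 2), delta2 is (2,), so the result is (4,)
    delta1 = (W2.T @ delta2) * (z1 > 0)

    # Gradient step; lr is a made-up learning rate
    lr = 0.01
    W2 -= lr * np.outer(delta2, a1)
    W1 -= lr * np.outer(delta1, x)

Compared with the code in the question, two things differ: the downstream weight matrix is transposed before it multiplies the downstream delta, and the weight update combines each delta with the layer's input via an outer product rather than using the delta alone.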