Incorrect implementation of backpropagation in Python


I'm trying to write a neural network from scratch in Python, but I've run into problems implementing backpropagation. I think I understand most of the math behind it, but I must be missing something, because I can't get it to work. I've also tried different pieces of code I found on the internet, but none of them seem to work either. I'd be grateful for any advice or a pointer in the right direction.

Here is my code:

 def backpropagation(self,x,y):
    # Place x in input
    self.layers[0].input(x)
    # feed forward 
    y_hat = self.feed_forward()
    for i in reversed(range(len(self.layers))):
        layer = self.layers[i]
        if layer == self.layers[-1]:
            error = y_hat - y
            layer.input_delta = error * self.activation_derivative(y_hat,layer.activation())
        else:
            next_layer = self.layers[i + 1]
            error = np.dot(next_layer.weight(), next_layer.delta())
            layer.input_delta = error * self.activation_derivative(layer.output(),layer.activation())

    for i in range(len(self.layers)):
        layer = self.layers[i]
        layer.input_weight += layer.delta() * self._alpha
And this is the error I get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
----> 1 Model.compile(train_data=(X_train,y_train_2),lr = 1 ,validation=(X_test,y_test_2),epoch = 20,batch = 10)

 in compile(self, train_data, lr, validation, epoch, batch)
     40 
     41                 for x, y in zip(X_train,y_train):
---> 42                     self.backpropagation(x,y)
     43 
     44                 True_result = 0

 in backpropagation(self, x, y)
     28             else:
     29                 next_layer = self.layers[i + 1]
---> 30                 error = np.dot(layer.weight(), next_layer.delta())
     31                 layer.input_delta = error * self.activation_derivative(layer.output(),layer.activation())
     32 

<__array_function__ internals> in dot(*args, **kwargs)

ValueError: shapes (4,4) and (2,2) not aligned: 4 (dim 1) != 2 (dim 0)
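
For what it's worth, this is just numpy's matrix-multiplication shape rule: np.dot on 2-D arrays requires the inner dimensions to agree, so a (4, 4) array cannot be multiplied by a (2, 2) one. A standalone snippet (the names are stand-ins, not the actual layer objects) reproduces the same error:

    import numpy as np

    w = np.ones((4, 4))   # stand-in for layer.weight()
    d = np.ones((2, 2))   # stand-in for next_layer.delta()

    # Raises ValueError: shapes (4,4) and (2,2) not aligned: 4 (dim 1) != 2 (dim 0)
    np.dot(w, d)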
The activation functions:

 def activation(self,x,activation):
        if(activation == "Relu"):
            return x * (x > 0)
        elif(activation == "Softmax"):
            e_x = np.exp(x - np.max(x)) 
        return e_x / e_x.sum(axis=0) 

 def activation_derivative(self,x,activation):
        if(activation == "Relu"):
            x[x<=0] = 0
            x[x>0] = 1
            return   x
        elif(activation == "Softmax"):
            deriv = x.reshape(-1,1)
            return np.diagflat(deriv) - np.dot(deriv, deriv.T) 
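
One detail worth checking in these two functions as written: the Relu branch returns an elementwise gradient with the same shape as x (and mutates x in place), while the Softmax branch returns the full (n, n) Jacobian, so the two branches produce differently shaped results. A standalone sketch, with the methods copied out as plain functions, illustrates this:

    import numpy as np

    def activation_derivative(x, activation):
        if activation == "Relu":
            x[x <= 0] = 0          # note: modifies x in place
            x[x > 0] = 1
            return x
        elif activation == "Softmax":
            deriv = x.reshape(-1, 1)
            return np.diagflat(deriv) - np.dot(deriv, deriv.T)

    v = np.array([1.0, -2.0, 3.0])
    print(activation_derivative(v.copy(), "Relu").shape)     # (3,)
    # for the shape check it doesn't matter that v isn't a real softmax output
    print(activation_derivative(v.copy(), "Softmax").shape)  # (3, 3)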

layer.weight() and next_layer.delta() do not have the same size. You are probably messing up the computation of one of these two after the "Relu" activation. Check the backpropagation formulas against your implementation.

I checked the activation functions again, and I don't think the problem is there :/
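
For reference, here is a minimal numpy sketch of the textbook delta recursion, using plain arrays instead of the Layer class above and hypothetical layer sizes (4 inputs, 4 hidden units, 2 outputs). The key point is that the hidden delta multiplies by the transpose of the next layer's weight matrix, which is what keeps the shapes aligned:

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sizes: 4 inputs -> 4 hidden units -> 2 outputs
    W1 = rng.standard_normal((4, 4))   # maps input (4,) to hidden (4,)
    W2 = rng.standard_normal((2, 4))   # maps hidden (4,) to output (2,)

    x = rng.standard_normal(4)
    y = np.array([0.0, 1.0])

    # Forward pass with a Relu hidden activation and a linear output
    z1 = W1 @ x
    a1 = z1 * (z1 > 0)
    z2 = W2 @ a1
    y_hat = z2

    # Output delta
    delta2 = y_hat - y                  # shape (2,)

    # Hidden delta: W2.T is (4, 2), delta2 is (2,), so the result is (4,)
    delta1 = (W2.T @ delta2) * (z1 > 0)

    # Gradient step; lr is a made-up learning rate
    lr = 0.01
    W2 -= lr * np.outer(delta2, a1)
    W1 -= lr * np.outer(delta1, x)

Compared with the code in the question, two things differ: the downstream weight matrix is transposed before it multiplies the downstream delta, and the weight update combines each delta with the layer's input via an outer product rather than using the delta alone.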