Incorrect implementation of backpropagation in Python
I am trying to write a neural network from scratch in Python, but I am having trouble implementing backpropagation. I think I understand most of the math behind it, but I must be missing something, because I cannot get it to work. I have also tried different code I found on the internet, but none of it seems to work either. I would appreciate any advice or a pointer in the right direction. Here is my code:
def backpropagation(self, x, y):
    # Place x in input
    self.layers[0].input(x)
    # feed forward
    y_hat = self.feed_forward()
    for i in reversed(range(len(self.layers))):
        layer = self.layers[i]
        if layer == self.layers[-1]:
            error = y_hat - y
            layer.input_delta = error * self.activation_derivative(y_hat, layer.activation())
        else:
            next_layer = self.layers[i + 1]
            error = np.dot(next_layer.weight(), next_layer.delta())
            layer.input_delta = error * self.activation_derivative(layer.output(), layer.activation())
    for i in range(len(self.layers)):
        layer = self.layers[i]
        layer.input_weight += layer.delta() * self._alpha
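For context, the standard backpropagation recursion propagates the error backwards through the *transposed* weight matrix of the next layer, which is what keeps the shapes aligned. A minimal sketch with made-up shapes matching the error message below (4 hidden units feeding 2 outputs; the variable names here are illustrative, not the poster's API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: a hidden layer of 4 units feeding 2 output units.
hidden_out = rng.random(4)    # hidden-layer activations, shape (4,)
W_next = rng.random((2, 4))   # next layer's weights: (outputs, inputs)
delta_next = rng.random(2)    # next layer's delta, shape (2,)

# np.dot(W_next, delta_next) has the wrong orientation; the usual recursion
# delta_l = (W_{l+1}^T @ delta_{l+1}) * f'(z_l) uses the transpose:
error = W_next.T @ delta_next      # shape (4,), now matches hidden_out
delta = error * (hidden_out > 0)   # elementwise product with the ReLU derivative
print(delta.shape)                 # (4,)
```

If the weight matrices are stored the other way around, (inputs, outputs), the transpose goes on the forward pass instead; the point is that one of the two directions needs it.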
This is the error I get:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
in
----> 1 Model.compile(train_data=(X_train,y_train_2),lr = 1 ,validation=(X_test,y_test_2),epoch = 20,batch = 10)
in compile(self, train_data, lr, validation, epoch, batch)
40
41 for x, y in zip(X_train,y_train):
---> 42 self.backpropagation(x,y)
43
44 True_result = 0
in backpropagation(self, x, y)
28 else:
29 next_layer = self.layers[i + 1]
---> 30 error = np.dot(layer.weight(), next_layer.delta())
31 layer.input_delta = error * self.activation_derivative(layer.output(),layer.activation())
32
<__array_function__ internals> in dot(*args, **kwargs)
ValueError: shapes (4,4) and (2,2) not aligned: 4 (dim 1) != 2 (dim 0)
Activation functions:
def activation(self, x, activation):
    if activation == "Relu":
        return x * (x > 0)
    elif activation == "Softmax":
        e_x = np.exp(x - np.max(x))
        return e_x / e_x.sum(axis=0)

def activation_derivative(self, x, activation):
    if activation == "Relu":
        x[x <= 0] = 0
        x[x > 0] = 1
        return x
    elif activation == "Softmax":
        deriv = x.reshape(-1, 1)
        return np.diagflat(deriv) - np.dot(deriv, deriv.T)
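One thing worth noting about the code above, separate from the shape error: the "Relu" branch of `activation_derivative` overwrites its argument in place, so any cached activations passed to it get corrupted for later use. A small sketch of a side-effect-free version (a generic alternative, not the poster's code):

```python
import numpy as np

def relu_derivative(x):
    # Build a fresh 0/1 array instead of overwriting the caller's array in place.
    return (x > 0).astype(x.dtype)

a = np.array([-1.0, 0.0, 2.0])   # pretend these are cached layer activations
d = relu_derivative(a)
print(d)   # [0. 0. 1.]
print(a)   # the cached activations are untouched: [-1.  0.  2.]
```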
Answer: layer.weight() and next_layer.delta() have different sizes. You are probably getting the computation of one of those two quantities wrong after the "Relu" activation. Check the backpropagation formulas against your implementation.

Comment from the asker: I checked the activation functions again and I don't think the problem is there :/
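A generic way to settle this kind of disagreement is a numerical gradient check: compare the analytic gradient against a finite-difference estimate on a tiny example. This is a standard debugging aid, not part of the poster's code; the loss function here is made up for illustration:

```python
import numpy as np

def numerical_grad(f, w, eps=1e-6):
    # Central finite-difference estimate of df/dw, one coordinate at a time.
    grad = np.zeros_like(w)
    for i in range(w.size):
        w_plus = w.copy()
        w_plus.flat[i] += eps
        w_minus = w.copy()
        w_minus.flat[i] -= eps
        grad.flat[i] = (f(w_plus) - f(w_minus)) / (2 * eps)
    return grad

# Toy example: loss = 0.5 * ||w||^2, whose analytic gradient is simply w.
w = np.array([1.0, -2.0, 3.0])
analytic = w
numeric = numerical_grad(lambda v: 0.5 * np.sum(v ** 2), w)
print(np.allclose(analytic, numeric, atol=1e-4))  # True
```

Running the same comparison against the gradients produced by `backpropagation` (on a network with one or two tiny layers) would pinpoint whether the error is in the deltas or in the weight update.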