Incorrect partial-derivative values in a Python neural network
I am implementing a simple neural network classifier for the iris dataset. The network has 3 input nodes, one hidden layer with 2 nodes, and 3 output nodes. I have implemented everything, but the partial derivatives are not being computed correctly. I have worn myself out looking for a fix and still cannot find one. Here is my code for computing the partial derivatives:
def derivative_cost_function(self, X, Y, thetas):
    '''
    Computes the derivatives of the cost function w.r.t. the input parameters (thetas)
    for the given inputs and labels.

    Input:
    ------
    X: either a single d x n-dimensional vector or a d x n-dimensional matrix of inputs
    thetas: a dk x 1-dimensional vector representing the parameters of the k classes
    Y: a k x n-dimensional label matrix

    Returns:
    ------
    partial_thetas: a dk x 1-dimensional vector of partial derivatives of the cost
    function w.r.t. the parameters.
    '''
    # forward pass
    a2, a3 = self.forward_pass(X, thetas)

    # now back-propagate
    # unroll thetas
    l1theta, l2theta = self.unroll_thetas(thetas)

    nexamples = float(X.shape[1])

    # compute delta3, l2theta
    a3 = np.array(a3)
    a2 = np.array(a2)
    Y = np.array(Y)
    a3 = a3.T
    delta3 = (a3 * (1 - a3)) * ((a3 - Y) / (a3 * (1 - a3)))
    l2Derivatives = np.dot(delta3, a2)
    #print "Layer 2 derivatives shape = ", l2Derivatives.shape
    #print "Layer 2 derivatives = ", l2Derivatives

    # compute delta2, l1theta
    a2 = a2.T
    dotProduct = np.dot(l2theta.T, delta3)
    delta2 = dotProduct * a2 * (1 - a2)
    l1Derivatives = np.dot(delta2[1:], X.T)
    #print "Layer 1 derivatives shape = ", l1Derivatives.shape
    #print "Layer 1 derivatives = ", l1Derivatives

    # remember to exclude the element of delta2 that represents the bias term,
    # i.e. delta2 = delta2[:-1]

    # roll the derivatives into one big vector
    thetas = (self.roll_thetas(l1Derivatives, l2Derivatives)).reshape(thetas.shape)  # return the same shape as received
    return thetas
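One thing worth noticing in the code above: the `delta3` expression multiplies the sigmoid derivative `a3 * (1 - a3)` by a term that divides by that same factor, so the two cancel algebraically and the output delta reduces to `a3 - Y` (the usual result for cross-entropy cost with sigmoid outputs). The long form is also numerically risky when activations approach 0 or 1. A minimal sketch with made-up activations and labels (the arrays below are illustrative, not from the question's dataset):

```python
import numpy as np

# Hypothetical sigmoid outputs and one-hot labels: 3 classes, 4 examples.
a3 = np.array([[0.8, 0.2, 0.1, 0.6],
               [0.1, 0.7, 0.3, 0.2],
               [0.1, 0.1, 0.6, 0.2]])
Y = np.array([[1.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

# The question's expression: sigmoid'(z3) times dCost/da3 for cross-entropy.
delta3_long = (a3 * (1 - a3)) * ((a3 - Y) / (a3 * (1 - a3)))

# The sigmoid-derivative factors cancel, leaving the plain output error.
delta3_short = a3 - Y

print(np.allclose(delta3_long, delta3_short))  # True
```

Using the short form avoids a division by zero whenever a saturated unit outputs exactly 0 or 1.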
Why not have a look at my implementation? The derivative computation is here:
def dCostFunction(self, theta, in_dim, hidden_dim, num_labels, X, y):
    # compute the gradient
    t1, t2 = self.uncat(theta, in_dim, hidden_dim)
    a1, z2, a2, z3, a3 = self._forward(X, t1, t2)  # p x s matrix

    # t1 = t1[1:, :]  # remove bias term
    # t2 = t2[1:, :]

    sigma3 = -(y - a3) * self.dactivation(z3)  # do not apply dsigmoid here? should I
    sigma2 = np.dot(t2, sigma3)
    term = np.ones((1, num_labels))
    sigma2 = sigma2 * np.concatenate((term, self.dactivation(z2)), axis=0)

    theta2_grad = np.dot(sigma3, a2.T)
    theta1_grad = np.dot(sigma2[1:, :], a1.T)

    theta1_grad = theta1_grad / num_labels
    theta2_grad = theta2_grad / num_labels

    return self.cat(theta1_grad.T, theta2_grad.T)
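Whichever implementation you use, the standard way to debug "my derivatives come out wrong" is gradient checking: compare the analytic gradient against a central finite-difference estimate of the cost. The helper below is a generic sketch (the quadratic cost is a stand-in for the network's cost function, used only to demonstrate the check):

```python
import numpy as np

def numerical_gradient(cost, thetas, eps=1e-5):
    """Central-difference estimate of dCost/dtheta, one parameter at a time."""
    grad = np.zeros_like(thetas, dtype=float)
    for i in range(thetas.size):
        bump = np.zeros_like(thetas, dtype=float)
        bump.flat[i] = eps
        grad.flat[i] = (cost(thetas + bump) - cost(thetas - bump)) / (2 * eps)
    return grad

# Sanity check on a cost with a known gradient: f(t) = sum(t**2), df/dt = 2t.
thetas = np.array([1.0, -2.0, 0.5])
approx = numerical_gradient(lambda t: np.sum(t ** 2), thetas)
print(np.allclose(approx, 2 * thetas))  # True
```

Applied to the question, you would pass the network's `cost_function` (with `X` and `Y` fixed) and compare the result element-wise against what `derivative_cost_function` returns; any layer whose entries disagree is where the backprop bug lives.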
Hope this helps.

Thanks for your reply, I will take a look at your code. I have edited my code and now divide the derivatives by the number of examples, as you do in yours, but the values are still incorrect :/ I cannot figure out where the problem is.

When computing the derivatives, do you remove the bias term?

No, I don't. Should the bias terms be removed? Sorry if that is a silly question, I am new to machine learning.

You have to. The bias term does not contribute any cost to the previous layer. See my code, I ran into exactly this issue.
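To make the bias point concrete: the bias unit has no incoming weights, so its delta must be dropped before forming the layer-1 gradients, otherwise the gradient matrix has an extra row and the wrong values. A small shape sketch (the bias-first row layout below is an assumption; whether the bias sits in the first or last row depends on how the thetas are packed):

```python
import numpy as np

# Hypothetical shapes: hidden layer of 2 units plus a bias row, 4 examples.
delta2_with_bias = np.random.randn(3, 4)   # rows: [bias, unit1, unit2]
X = np.random.randn(3, 4)                  # 3 input features, 4 examples

# Drop the bias row before back-propagating: the bias unit has no incoming
# weights, so its delta must not flow back toward the inputs.
delta2 = delta2_with_bias[1:, :]           # shape (2, 4)
l1Derivatives = np.dot(delta2, X.T)        # shape (2, 3), matches theta1
print(l1Derivatives.shape)  # (2, 3)
```

If the bias row is stored last instead, the slice becomes `delta2_with_bias[:-1, :]`, matching the `delta2 = delta2[:-1]` comment in the question's code.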