Implementing the neural network cost function in Python (week 5 of the course)


python numpy machine-learning


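Based on the Coursera Machine Learning course, I am trying to implement the cost function for a neural network in Python. There is a similar question with an accepted answer, but the code in that answer is written in Octave. Not wanting to be lazy, I tried to adapt the relevant concepts from that answer to my case, and as far as I can tell I have implemented the function correctly. However, the cost I get differs from the expected cost, so I must be doing something wrong.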

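Here is a small reproducible example:

The link below points to a .npz file that can be loaded (as shown below) to obtain the relevant data. If you use the file, please rename it to "arrays.npz".

In fact, cost should be 0.287629 and cost+newCost should be 0.383770.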

This is the cost function posted in the question linked above, for reference:
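Written out, and assuming the usual notation of the course (m training examples, K output classes, h the output-layer activations, and weight matrices whose first column holds the bias terms), the regularized cost that the code in the answers below computes has this form. In the first answer, h corresponds to thrLayer, y to Y_arr, the two weight matrices to thetaO and thetaT, and lambda to lam:

```latex
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \left[ y_k^{(i)} \log\!\left( h_\Theta(x^{(i)})_k \right)
       + \left( 1 - y_k^{(i)} \right) \log\!\left( 1 - h_\Theta(x^{(i)})_k \right) \right]
  + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
    \left( \Theta_{j,i}^{(l)} \right)^2
```

The inner sums of the regularization term start at i = 1, so the bias column (i = 0) of each weight matrix is left out, which matches the range(1, ...) loops in the code.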

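The problem is that you are using the wrong class labels. When computing the cost function, you need to use the ground truth, i.e. the true class labels.

I'm not sure what your Ynew array is, but it is not the training output. So I changed your code to use Y instead of Ynew for the class labels, and got the correct cost.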

import numpy as np

with np.load("arrays.npz") as data:

    thrLayer = data['thrLayer'] # The final layer post activation; you
    # can derive this final layer, if verification needed, using weights below

    thetaO = data['thetaO'] # The weight array between layers 1 and 2
    thetaT = data['thetaT'] # The weight array between layers 2 and 3

    Ynew = data['Ynew'] # The output array with a 1 in position i and 0s elsewhere,
    # where class i is the class that the data described by X[i,:] belongs to

    X = data['X'] #Raw data with 1s appended to the first column
    Y = data['Y'] #One dimensional column vector; entry i contains the class of entry i


m = len(thrLayer)
k = thrLayer.shape[1]
cost = 0

Y_arr = np.zeros(Ynew.shape)
for i in range(m):
    Y_arr[i,int(Y[i,0])-1] = 1

for i in range(m):
    for j in range(k):
        cost += -Y_arr[i,j]*np.log(thrLayer[i,j]) - (1 - Y_arr[i,j])*np.log(1 - thrLayer[i,j])
cost /= m

# Regularized cost component

regCost = 0

for i in range(len(thetaO)):
    for j in range(1,len(thetaO[0])):
        regCost += thetaO[i,j]**2

for i in range(len(thetaT)):
    for j in range(1,len(thetaT[0])):
        regCost += thetaT[i,j]**2
lam=1
regCost *= lam/(2.*m)


print(cost)
print(cost + regCost)

This produces:

0.287629165161
0.383769859091

Edit: fixed the integer-division bug that was zeroing out regCost *= lam/(2*m).
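The effect of that bug can be reproduced in isolation. The following is a minimal sketch in Python 3, where floor division (//) stands in for what Python 2 did when both operands of / were integers; the value of m is a made-up example count, not the size of the data set in the question.

```python
lam = 1
m = 5000  # hypothetical number of training examples, for illustration only

# Floor division mimics Python 2's int / int behavior: the factor collapses to 0,
# so any regCost multiplied by it becomes 0 as well.
floor_factor = lam // (2 * m)

# Making one operand a float (as 2.*m does in the answer) gives the intended factor.
float_factor = lam / (2.0 * m)

print(floor_factor)  # 0
print(float_factor)  # 0.0001
```

In Python 3 the plain / operator already performs true division, so the problem only appears under Python 2 or when // is used explicitly.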

You can try this implementation:

import numpy as np
import scipy.io
mat=scipy.io.loadmat('ex4data1.mat')
X=mat['X']
y=mat['y']

theta=scipy.io.loadmat('ex4weights.mat')
theta1=theta['Theta1']
theta2=theta['Theta2']
theta=[theta1,theta2]



new=np.zeros((10,len(y)))
for i in range(len(y)):
    new[y[i]-1,i]=1

y=new

def sigmoid(x):
    return 1/(1+np.exp(-x))

def reg_cost(theta,X,y,lambda1):
    current=X
    for i in range(len(theta)):
        a= np.append(np.ones((len(current),1)),current,axis=1)
        z=np.matmul(a,theta[i].T)
        z=sigmoid(z)
        current=z
    htheta=current
    ans=(np.sum(np.multiply(np.log(htheta),(y).T)) +
         np.sum(np.multiply(np.log(1-htheta),(1-y).T)))
    ans=-ans/len(X)
    for i in range(len(theta)):
        new=theta[i][:,1:]
        newsum=np.sum(np.multiply(new,new))
        ans+=newsum*(lambda1)/(2*len(X))

    return ans

print(reg_cost(theta,X,y,1))

It outputs:

0.3837698590909236
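Both answers build the one-hot label matrix with an explicit loop. As a minimal sketch of a vectorized alternative, the code below builds the same kind of matrix with np.eye and computes the cost without loops, then cross-checks it against the double loop from the first answer. The helper names one_hot and nn_cost and all of the data are made up for illustration; this is not the data from arrays.npz or ex4data1.mat.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return an (m, num_classes) matrix with a 1 in column label-1 of each row."""
    labels = np.asarray(labels).astype(int).ravel()
    return np.eye(num_classes)[labels - 1]

def nn_cost(h, Y_onehot, thetas, lam):
    """Return (unregularized cost, regularized cost); column 0 of each theta is the bias."""
    m = h.shape[0]
    cost = -np.sum(Y_onehot * np.log(h) + (1 - Y_onehot) * np.log(1 - h)) / m
    reg = sum(np.sum(t[:, 1:] ** 2) for t in thetas) * lam / (2.0 * m)
    return cost, cost + reg

# Made-up inputs: 4 examples, 3 classes, labels in 1..3 stored as a column vector.
rng = np.random.default_rng(0)
labels = np.array([[1], [3], [2], [3]])
h = rng.uniform(0.05, 0.95, size=(4, 3))           # stand-in for output activations
thetas = [rng.normal(size=(5, 4)), rng.normal(size=(3, 6))]

Y_onehot = one_hot(labels, 3)
cost, total = nn_cost(h, Y_onehot, thetas, lam=1)

# Cross-check against the explicit double loop used in the first answer.
loop_cost = 0.0
for i in range(4):
    for j in range(3):
        loop_cost += (-Y_onehot[i, j] * np.log(h[i, j])
                      - (1 - Y_onehot[i, j]) * np.log(1 - h[i, j]))
loop_cost /= 4

print(np.isclose(cost, loop_cost))  # True
```

Indexing with labels - 1 sends label 1 to column 0 and the highest label to the last column, which is the same convention as Y_arr[i,int(Y[i,0])-1] = 1 in the first answer.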

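What lambda value are you using? Also, just to be sure about the value of the cost (I get ~10.441): can you tell us where your expected cost comes from?

I am quite convinced that your implementation of the cost function is correct. I think the problem lies in your computation of the activations.

Yes, Ynew and Y_arr are definitely different, and I think your implementation of Y_arr is correct. The original […] array has 100 […]s, 100 […]s, and so on, up to 100 […]s. I mapped the […]s to index 0 based on a comment in chapter 4, which uses the same data set: "Note that the y argument to the function is a vector of labels from 1 to 10, where we have mapped the digit 0 to the label 10 (to avoid confusion with indexing)." I'm not sure whether that means something different from what I thought? The only remaining uncertainty is then why the regularized cost is 0.0 rather than ~0.38-0.28.

You were doing integer division in regCost *= lam/(2*m) and then multiplying by 0. I fixed that and got the correct cost.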
