Python neural network loss does not decrease during training

Tags: python, neural-network, deep-learning

I wrote a simple neural network in Python to test my understanding of neural networks. My problem is that the loss does not decrease when I train the network. I am using the sum of squares as the loss function. Can someone with neural network experience tell me where I went wrong?

Details:

I am trying to train the network on a simple 2D dataset in which each point belongs to one of two classes. My dataset is similar to the one used in the interactive neural network tool provided by TensorFlow:

I use one hidden layer with three hidden units and an output layer with a single unit. The loss function is the sum of squares. I used the following source to help me understand how backpropagation works:
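
For reference, here is a minimal sketch of the setup described above: a 2-3-1 network with sigmoid activations and a sum-of-squares loss. The function and variable names are illustrative, not taken from my actual code:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical forward pass for a 2-3-1 network.
# W1: (3, 2), b1: (3, 1), W2: (1, 3), b2: (1, 1); x holds one point per column.
def forward(x, W1, b1, W2, b2):
    z1 = W1 @ x + b1      # hidden pre-activation
    a1 = sigmoid(z1)      # hidden activation
    z2 = W2 @ a1 + b2     # output pre-activation
    a2 = sigmoid(z2)      # network output
    return z1, a1, z2, a2

def sum_of_squares_loss(a2, y):
    # Sum of squared errors over all training points.
    return np.sum((a2 - y) ** 2)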

Edit

The output of the loss function at each training step:

sum of squares loss: 9.41904563854931
sum of squares loss: 9.466209959933774
sum of squares loss: 9.521526062849716
sum of squares loss: 9.586899865148004
sum of squares loss: 9.664794367157919
sum of squares loss: 9.758420389666265
sum of squares loss: 9.871998308092504
sum of squares loss: 10.01111579994052
sum of squares loss: 10.183210868157662
sum of squares loss: 10.398206323346686
sum of squares loss: 10.669295283053922
sum of squares loss: 11.01378465075701
sum of squares loss: 11.453638064297865
sum of squares loss: 12.014673540964441
sum of squares loss: 12.72180795275546
sum of squares loss: 13.585099304548
sum of squares loss: 14.571325340888313
sum of squares loss: 15.575875071643217
sum of squares loss: 16.450858359675404
sum of squares loss: 17.09492556599039
sum of squares loss: 17.50379115136118
sum of squares loss: 17.737928687252197
sum of squares loss: 17.86441528900727
sum of squares loss: 17.93065857787634
sum of squares loss: 17.964762961628182
sum of squares loss: 17.982154959175745
sum of squares loss: 17.99097886266222
sum of squares loss: 17.995443731332028
sum of squares loss: 17.997699846969052
sum of squares loss: 17.998839078868475
sum of squares loss: 17.99941413509295
sum of squares loss: 17.999704357817745
sum of squares loss: 17.999850815993767
sum of squares loss: 17.999924721397637
sum of squares loss: 17.99996201452985
sum of squares loss: 17.999980832662573
sum of squares loss: 17.99999032824638
sum of squares loss: 17.999995119680737
sum of squares loss: 17.999997537416142
sum of squares loss: 17.999998757393232
sum of squares loss: 17.999999372987293
sum of squares loss: 17.99999968361277
sum of squares loss: 17.999999840352714
sum of squares loss: 17.999999919442843
sum of squares loss: 17.9999999593513
sum of squares loss: 17.999999979488887
sum of squares loss: 17.999999989650206
sum of squares loss: 17.99999999477755
sum of squares loss: 17.99999999736478
sum of squares loss: 17.999999998670287
sum of squares loss: 17.99999999932903
sum of squares loss: 17.999999999661433
sum of squares loss: 17.999999999829164
sum of squares loss: 17.999999999913797
sum of squares loss: 17.999999999956504
sum of squares loss: 17.99999999997805
sum of squares loss: 17.999999999988926
sum of squares loss: 17.99999999999441
sum of squares loss: 17.99999999999718
sum of squares loss: 17.99999999999858
sum of squares loss: 17.999999999999282
sum of squares loss: 17.999999999999638
sum of squares loss: 17.999999999999815
sum of squares loss: 17.999999999999908
sum of squares loss: 17.99999999999995
sum of squares loss: 17.99999999999998
sum of squares loss: 17.999999999999993
sum of squares loss: 17.999999999999993
sum of squares loss: 18.0
sum of squares loss: 18.0
sum of squares loss: 18.0
sum of squares loss: 18.0
sum of squares loss: 18.0
sum of squares loss: 18.0
Edit 2

Input data:

3,4 -> 0
2,6 -> 0
1,6 -> 0
1,1 -> 1
2,1 -> 1
3,1 -> 1

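To make this reproducible, the six points above can be hard-coded as numpy arrays, for example (the array names here are my own):

import numpy as np

# One training point per column; the second row is the y coordinate.
X = np.array([[3, 2, 1, 1, 2, 3],
              [4, 6, 6, 1, 1, 1]], dtype=float)
# Class label for each column.
y = np.array([[0, 0, 0, 1, 1, 1]], dtype=float)
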
The error seems to be in this line of the function backpropagate_step2:

dz = np.multiply(sigmoid(z_prev) - (1 - sigmoid(z_prev)), back_value)
should be

dz = np.multiply(sigmoid(z_prev)*(1 - sigmoid(z_prev)), back_value)

because the derivative of sigmoid(x) is sigmoid(x)*(1 - sigmoid(x)), not sigmoid(x) - (1 - sigmoid(x)).
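
A quick finite-difference check confirms this; the following standalone sketch compares the buggy expression with the correct derivative:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

for x in (0.5, -1.0):
    eps = 1e-6
    numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
    wrong = sigmoid(x) - (1 - sigmoid(x))   # buggy expression
    right = sigmoid(x) * (1 - sigmoid(x))   # correct derivative
    print(x, numeric, wrong, right)

# x = 0.5  -> numeric  0.2350, wrong  0.2449, right 0.2350
# x = -1.0 -> numeric  0.1966, wrong -0.4621, right 0.1966

For negative pre-activations the buggy expression even has the wrong sign, so those weight updates step uphill rather than downhill, which would match the loss climbing and then plateauing in the log above.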

In addition to the TensorFlow link, posting some of the actual data you are working with would help; debugging a neural network just by reading its source code is quite difficult. Are you tracking the loss function at set time steps? Can you show what the loss function is actually doing? Have you checked your gradients by performing a gradient check? Finally, for binary classification we usually use a cross-entropy loss rather than a mean squared error loss. You can use a squared loss and it should not break the network, but be aware that cross-entropy is preferred in this setting.

@enumaris I have added the loss function output for every training step.

@JagrutSharma I hard-coded the data and have added it to the post separately, thanks for the suggestion! Unfortunately, I still see the same behavior, and I don't know why.
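
The gradient check suggested in the comments can be done generically with central finite differences. This is a sketch with hypothetical names, not my actual training code:

import numpy as np

def grad_check(loss_fn, params, analytic_grad, eps=1e-5):
    # Compare an analytic gradient with a central finite-difference estimate.
    numeric = np.zeros_like(params)
    for i in range(params.size):
        p = params.copy(); p.flat[i] += eps
        m = params.copy(); m.flat[i] -= eps
        numeric.flat[i] = (loss_fn(p) - loss_fn(m)) / (2 * eps)
    # A relative error well below 1e-5 usually indicates a correct gradient.
    denom = np.maximum(np.abs(numeric) + np.abs(analytic_grad), 1e-12)
    return np.max(np.abs(numeric - analytic_grad) / denom)

Running a check like this on each weight matrix before training would have flagged the sign error in the sigmoid derivative immediately.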