Python 3.x gradient descent cost convergence

I have a small script that makes the cost converge to zero for the dataset xa and ya. With the dataset xb and yb, however, the best cost I can get is 31.604, no matter what values I use for the number of iterations and the learning rate.

My question is: should the cost always tend toward zero? If so, what am I doing wrong with the dataset xb and yb?



import numpy as np

def gradient_descent(x, y):
    # Fit y = m*x + b by batch gradient descent on the mean squared error.
    m_curr = b_curr = 0
    iterations = 1250
    n = len(x)
    learning_rate = 0.08

    for i in range(iterations):
        y_predicted = (m_curr * x) + b_curr
        # Cost: mean of the squared residuals.
        cost = (1/n) * sum([val**2 for val in (y - y_predicted)])
        # Partial derivatives of the cost with respect to m and b.
        m_der = -(2/n) * sum(x * (y - y_predicted))
        b_der = -(2/n) * sum(y - y_predicted)
        # Step against the gradient.
        m_curr = m_curr - (learning_rate * m_der)
        b_curr = b_curr - (learning_rate * b_der)
        print('m {}, b {}, cost {}, iteration {}'.format(m_curr, b_curr, cost, i))

xa = np.array([1, 2, 3, 4, 5])
ya = np.array([5, 7, 9, 11, 13])

# xb = np.array([92, 56, 88, 70, 80, 49, 65, 35, 66, 67])
# yb = np.array([98, 68, 81, 80, 83, 52, 66, 30, 68, 73])

gradient_descent(xa, ya)

# gradient_descent(xb, yb)






With your xa, ya data, gradient descent can balance both sides of the equation with an error of ~0. With xb, yb, the best it can do is an error of ~31.

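The cost is simply the mean squared error left over once gradient descent has balanced the equation as well as it can, so it does not have to tend toward zero. It reaches zero only when every point lies exactly on one line, as with ya = 2*xa + 3.
Work out both sides of the equation by hand and this becomes clear.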
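One way to do that check without iterating is to compute the least-squares line directly and look at what is left over. The sketch below is not part of the original post: the helper name best_fit_cost is made up for illustration, and it uses np.polyfit as a stand-in for the line gradient descent is heading toward.

```python
import numpy as np

def best_fit_cost(x, y):
    """Return (m, b, cost) for the least-squares line through (x, y)."""
    m, b = np.polyfit(x, y, deg=1)        # closed-form least-squares fit
    residuals = y - (m * x + b)           # gap between the two sides of y = m*x + b
    return m, b, np.mean(residuals ** 2)  # same cost formula as gradient_descent

xa = np.array([1, 2, 3, 4, 5])
ya = np.array([5, 7, 9, 11, 13])
xb = np.array([92, 56, 88, 70, 80, 49, 65, 35, 66, 67])
yb = np.array([98, 68, 81, 80, 83, 52, 66, 30, 68, 73])

m_a, b_a, cost_a = best_fit_cost(xa, ya)
m_b, b_b, cost_b = best_fit_cost(xb, yb)
print(m_a, b_a, cost_a)  # cost is zero up to rounding: ya equals 2*xa + 3 exactly
print(m_b, b_b, cost_b)  # cost is about 31.60: no single line hits every point
```

If this check is right, the 31.604 from the question is already the smallest cost any straight line can reach on xb, yb, so nothing is wrong with the script.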


m 1.0445229983270568, b 0.01691112775956422, cost 31.811378572605147, iteration 995
m 1.0445229675787642, b 0.01691330681124408, cost 31.81137809768319, iteration 996
m 1.044522936830507, b 0.016915485860422623, cost 31.811377622762304, iteration 997
m 1.044522906082285, b 0.016917664907099856, cost 31.811377147842503, iteration 998
m 1.0445228753340983, b 0.01691984395127578, cost 31.811376672923775, iteration 999
m 1.017952329085966, b 1.8999054866690825, cost 31.604524796644444, iteration 199995
m 1.0179523238769337, b 1.8999058558198456, cost 31.60452479599536, iteration 199996
m 1.0179523186680224, b 1.89990622496171, cost 31.604524795346318, iteration 199997
m 1.017952313459241, b 1.899906594094676, cost 31.60452479469731, iteration 199998
m 1.017952308250581, b 1.8999069632187437, cost 31.604524794048356, iteration 199999
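In the log above, b is still drifting after 200,000 iterations. A likely reason is the scale of xb: with x values between 35 and 92, the update for m only stays stable at a very small learning rate (roughly 0.0002 or less for this data), and b then moves slowly. One common remedy, which does not appear in the original post, is to standardize x before applying the same update rule. The sketch below assumes that approach; the function name gradient_descent_standardized and its parameters are hypothetical.

```python
import numpy as np

def gradient_descent_standardized(x, y, learning_rate=0.08, iterations=1250):
    """Fit y = m*x + b by gradient descent on a standardized copy of x."""
    mu, sigma = x.mean(), x.std()
    xs = (x - mu) / sigma                  # zero mean, unit variance
    m_s = b_s = 0.0
    n = len(x)
    for _ in range(iterations):
        error = y - (m_s * xs + b_s)
        m_s -= learning_rate * (-(2 / n) * np.sum(xs * error))
        b_s -= learning_rate * (-(2 / n) * np.sum(error))
    # Map the parameters back to the original x scale.
    m = m_s / sigma
    b = b_s - m_s * mu / sigma
    cost = np.mean((y - (m * x + b)) ** 2)
    return m, b, cost

xb = np.array([92, 56, 88, 70, 80, 49, 65, 35, 66, 67])
yb = np.array([98, 68, 81, 80, 83, 52, 66, 30, 68, 73])

m, b, cost = gradient_descent_standardized(xb, yb)
print(m, b, cost)  # cost settles at about 31.60, the same floor as in the log
```

The cost floor does not change, since the data are the same; only the number of iterations needed to reach it does.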