Different answers on the same dataset when applying linear regression in machine learning! Why?

I have written the following code for linear regression. The file data.csv has two columns: X and Y.

Here is my code:

import numpy as np

def gradientDescent(x, y, theta, alpha, m, numIterations):
    # Batch gradient descent for least-squares linear regression.
    xTrans = x.T
    for i in range(numIterations):
        print('Iteration : ', i + 1)
        hypo = np.dot(x, theta)                    # predictions, shape (m, 1)
        cost = np.sum((hypo - y) ** 2) / (2 * m)   # half mean squared error
        print('Cost : ', cost)
        gradient = np.dot(xTrans, (hypo - y)) / m  # gradient of the cost w.r.t. theta
        theta = theta - alpha * gradient
        print('theta : ', theta)
    return theta

data = np.loadtxt('/Users/Nikesh/Downloads/linear_regression_live-master/data.csv', delimiter=',')
x = data[:, 0:1]
y = data[:, 1:2]
a = np.ones((x.shape[0], 1))   # column of ones for the intercept term
x = np.append(a, x, axis=1)
m, n = np.shape(x)
numIterations = 1000
alpha = 0.0005
theta = np.ones(n)
theta = theta[:, np.newaxis]   # column vector, shape (n, 1)
theta = gradientDescent(x, y, theta, alpha, m, numIterations)

Final output after 1000 iterations:

Iteration : 1000
Cost : 56.014846105
theta : [[1.1395461] [1.45709467]]

From this, I take the fitted line to be y = 1.139 + 1.457 X.
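For reference, the quantities that gradientDescent computes in each iteration can be written out as below. The notation is introduced here only to restate the code: X is the m-by-2 matrix with a leading column of ones, x^{(i)} is its i-th row, and alpha is the learning rate.

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( x^{(i)} \theta - y^{(i)} \right)^{2},
\qquad
\nabla J(\theta) = \frac{1}{m} X^{\top} \left( X\theta - y \right),
\qquad
\theta \leftarrow \theta - \alpha \, \nabla J(\theta)
```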

Next, this code on the same dataset:

import numpy as np
from sklearn import linear_model

data = np.loadtxt('/Users/Nikesh/Downloads/linear_regression_live-master/data.csv', delimiter=',')
regr = linear_model.LinearRegression()   # ordinary least squares; fits an intercept by default
regr.fit(data[:, 0:1], data[:, 1:2])
print('Co-efficients : ', regr.coef_)
print('Intercept : ', regr.intercept_)
print('Regression line : ', regr.intercept_, '+', regr.coef_, ' X')

The output is:

Co-efficients : [[1.32243102]]
Intercept : [7.99102099]
Regression line : [7.99102099] + [[1.32243102]] X

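The last one is something I found online: linear regression (the gradient descent algorithm) applied to the same dataset.

Can anyone help me point out what is going wrong?

Here is the link to the dataset: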

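When the number of iterations is finite, the final value of θ depends on its initial value and on the value of α.

The learning rate is small because I do not want the algorithm to take huge steps between iterations. The θ values converge somewhere in the middle (around iteration 500) and do not change from then on. @frederick99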
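The dependence on the starting point and on the iteration count can be checked with a small experiment. The sketch below is not the asker's data: it assumes a synthetic dataset with a similar shape (100 points, x roughly between 25 and 70, a hypothetical true line y = 8 + 1.3 x plus noise), and the function and variable names are chosen here for illustration. It compares gradient descent after 1,000 iterations from two starting points, gradient descent after 300,000 iterations, and the least-squares solution from np.linalg.lstsq.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, num_iterations):
    """Batch gradient descent on the half mean squared error."""
    m = len(y)
    for _ in range(num_iterations):
        residual = X @ theta - y
        theta = theta - alpha * (X.T @ residual) / m
    return theta

# Synthetic stand-in for data.csv (hypothetical values, not the real dataset).
rng = np.random.default_rng(0)
x = rng.uniform(25.0, 70.0, size=100)
y = 8.0 + 1.3 * x + rng.normal(0.0, 3.0, size=100)

X = np.column_stack([np.ones_like(x), x])  # leading column of ones for the intercept
Y = y.reshape(-1, 1)

# Reference least-squares solution, computed without iterating.
theta_exact, *_ = np.linalg.lstsq(X, Y, rcond=None)

alpha = 0.0005
from_ones_1k = gradient_descent(X, Y, np.ones((2, 1)), alpha, 1_000)
from_zeros_1k = gradient_descent(X, Y, np.zeros((2, 1)), alpha, 1_000)
from_ones_300k = gradient_descent(X, Y, np.ones((2, 1)), alpha, 300_000)

print("least squares          :", theta_exact.ravel())
print("GD, start=1, 1k iters  :", from_ones_1k.ravel())
print("GD, start=0, 1k iters  :", from_zeros_1k.ravel())
print("GD, start=1, 300k iters:", from_ones_300k.ravel())
```

On data like this, the slope settles quickly while the intercept moves very slowly, because the unscaled x column is much larger than the column of ones. The cost can therefore look flat long before θ reaches the least-squares values. If the real data behaves the same way, running many more iterations or standardizing x before running gradient descent would likely bring the two results together, though that has not been verified against data.csv here.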