Machine learning: Vectorized gradient descent Octave code not updating the cost function correctly

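I have implemented the gradient descent code below using vectorization, but the cost function does not seem to decrease correctly. Instead, it increases with every iteration.

Assume theta is an (n+1)-vector, y is an m-vector, and X is the m*(n+1) design matrix.

The cost function is computed as follows: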

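function J = computeCost(X, y, theta)
m = length(y);              % number of training examples
J = 0;
for i = 1:m,
   H = theta' * X(i,:)';    % hypothesis for training example i
   E = H - y(i);            % prediction error
   SQE = E^2;               % squared error
   J = (J + SQE);           % accumulate the sum of squared errors
   i = i+1;                 % redundant: the for loop already advances i
end;
J = J / (2*m);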
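
Written as a formula, the loop above appears to compute the usual squared-error cost for linear regression. The notation x^{(i)} for the i-th row of X (taken as a column vector) is an assumption introduced here for illustration:

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( \theta^{T} x^{(i)} - y^{(i)} \right)^{2}
          = \frac{1}{2m} \, (X\theta - y)^{T} (X\theta - y)
```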

You can vectorize it even further:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
    m = length(y); 
    J_history = zeros(num_iters, 1);

    for iter = 1:num_iters

       delta = (theta' * X'-y')*X;
       theta = theta - alpha/m*delta';
       J_history(iter) = computeCost(X, y, theta);

    end

end
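
As a rough sketch of why this update is a gradient step: assuming the squared-error cost J(theta) = 1/(2m) * ||X*theta - y||^2 that computeCost evaluates, the row vector delta in the code is the transpose of the unscaled gradient.

```latex
\begin{aligned}
\nabla_{\theta} J(\theta) &= \frac{1}{m} \, X^{T} (X\theta - y) \\
\delta &= (\theta^{T} X^{T} - y^{T}) \, X = (X\theta - y)^{T} X
  \quad\Rightarrow\quad \delta^{T} = X^{T} (X\theta - y) \\
\theta &\leftarrow \theta - \frac{\alpha}{m} \, \delta^{T}
  = \theta - \alpha \, \nabla_{\theta} J(\theta)
\end{aligned}
```

Because theta is updated as a whole vector in one statement, all components are updated simultaneously from the same old value of theta.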

You can vectorize it better as follows:

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);

  for iter = 1:num_iters

     theta=theta-(alpha/m)*((X*theta-y)'*X)';
     J_history(iter) = computeCost(X, y, theta);

  end;
end;

The computeCost function can be written as:

function J = computeCost(X, y, theta)
  m = length(y); 

  J = 1/(2*m)*sum((X*theta-y).^2);   % .^ squares element-wise; ^ would attempt a matrix power

end;
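
To check that the cost actually decreases, the two functions can be run on a tiny data set. The following is a minimal sketch, not a definitive implementation: the toy data, the learning rate 0.05, and the iteration count 2000 are made-up values chosen for illustration, and the leading `1;` is only there so Octave treats the file as a script with inline function definitions.

```octave
1;  % make this a script file so the functions below can be defined inline

function J = computeCost(X, y, theta)
  m = length(y);
  J = sum((X*theta - y).^2) / (2*m);   % element-wise square, then sum
end

function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
  m = length(y);
  J_history = zeros(num_iters, 1);
  for iter = 1:num_iters
    % X' * (X*theta - y) is the same vector as ((X*theta - y)' * X)'
    theta = theta - (alpha/m) * (X' * (X*theta - y));
    J_history(iter) = computeCost(X, y, theta);
  end
end

% Hypothetical toy data: y = 1 + 2*x exactly, so theta should approach [1; 2]
x = (1:5)';
X = [ones(5, 1), x];      % design matrix with an intercept column
y = 1 + 2*x;

[theta, J_history] = gradientDescent(X, y, zeros(2, 1), 0.05, 2000);

disp(theta');                           % fitted parameters
disp(all(diff(J_history) <= 1e-12));    % 1 if the cost never increased
```

If J_history grows instead of shrinking with code like this, one thing worth trying is a smaller alpha, since a step size that is too large makes gradient descent diverge even when the update itself is correct.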

Doesn't for i = 1:n already increment i? You are also incrementing it inside the loop. (It has been a long time since I last used Octave...)

Please consider explaining your answer in more detail, rather than only providing code that solves the problem.