Gradient descent linear regression in Java

java machine-learning

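This is a bit of a long shot, but I was wondering if someone could take a look at this. Am I doing batch gradient descent for linear regression correctly here? It gives the expected answers for a single independent variable plus intercept, but not for multiple independent variables.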

import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;
import cern.colt.matrix.linalg.Algebra;
import cern.jet.math.Functions;

/**
 * One step of batch gradient descent for linear regression
 * (using Colt Matrix library)
 * @param alpha Learning Rate
 * @param thetas Current Thetas (updated in place)
 * @param independent Examples (MxN), first column all ones for the intercept
 * @param dependent Observed values (M)
 * @return new Thetas
 */
public DoubleMatrix1D descent(double         alpha,
                              DoubleMatrix1D thetas,
                              DoubleMatrix2D independent,
                              DoubleMatrix1D dependent ) {
    Algebra algebra     = new Algebra();

    // ALPHA*(1/M) in one.
    double  modifier    = alpha / (double)independent.rows();

    // I think this can just skip the transpose of theta.
    // This is the result of every Xi run through the thetas (hypothesis fn),
    // so each Xj feature is multiplied by its theta to get the results of the hypothesis.
    DoubleMatrix1D hypotheses = algebra.mult( independent, thetas );

    // hypothesis - Y
    // Now we have, for each Xi, the difference between the value predicted by the hypothesis and the actual Yi.
    hypotheses.assign(dependent, Functions.minus);

    // Transpose Examples (MxN) to NxM so we can matrix multiply by the residual vector (Mx1)
    DoubleMatrix2D transposed = algebra.transpose(independent);

    DoubleMatrix1D deltas     = algebra.mult(transposed, hypotheses );

    // Scale the deltas by 1/m and learning rate alpha.  (alpha/m)
    deltas.assign(Functions.mult(modifier));

    // Theta = Theta - Deltas
    thetas.assign( deltas, Functions.minus );

    return( thetas );
}
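For reference, the update that a single call to the method above performs can be written in matrix form. The symbols are labels chosen here, not names from the question: X is the m×n matrix `independent`, y is the vector `dependent`, and m is the number of rows.

```latex
\theta \leftarrow \theta - \frac{\alpha}{m}\, X^{\top} (X\theta - y)
```

This is the gradient step for the cost $J(\theta) = \frac{1}{2m}\lVert X\theta - y\rVert^{2}$. The question never states the cost function, so treating it as this usual least-squares cost is an assumption.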

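There is nothing wrong with your implementation. Based on your comment, the problem is the collinearity you introduced when generating x2, which is problematic in regression estimation.

To test your algorithm, you can generate two independent columns of random numbers. Choose values for w0, w1 and w2, i.e. the coefficients for the intercept, x1 and x2 respectively. Compute the dependent value y.

Then see whether your stochastic/batch gradient descent algorithm can recover the w0, w1 and w2 values.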
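The test described above can be sketched in plain Java without any matrix library. Everything here is illustrative: the class and method names (`SyntheticRegressionCheck`, `fit`, `randomDesign`, `response`), the seed, the sample size, the learning rate and the chosen weights are assumptions, not part of the original question or answer.

```java
import java.util.Arrays;
import java.util.Random;

class SyntheticRegressionCheck {

    /** Batch gradient descent from thetas = 0; each step is theta -= (alpha/m) * X^T (X*theta - y). */
    static double[] fit(double[][] x, double[] y, double alpha, int iterations) {
        int m = x.length;
        int n = x[0].length;
        double[] thetas = new double[n];
        for (int k = 0; k < iterations; k++) {
            double[] grad = new double[n];
            for (int i = 0; i < m; i++) {
                double residual = -y[i];
                for (int j = 0; j < n; j++) residual += x[i][j] * thetas[j];
                for (int j = 0; j < n; j++) grad[j] += residual * x[i][j] / m;
            }
            for (int j = 0; j < n; j++) thetas[j] -= alpha * grad[j];
        }
        return thetas;
    }

    /** Builds m rows of {1, x1, x2}, with x1 and x2 drawn independently from [-1, 1). */
    static double[][] randomDesign(int m, long seed) {
        Random random = new Random(seed);
        double[][] x = new double[m][];
        for (int i = 0; i < m; i++) {
            x[i] = new double[] {1.0, random.nextDouble() * 2 - 1, random.nextDouble() * 2 - 1};
        }
        return x;
    }

    /** Noise-free dependent values y = w0 + w1*x1 + w2*x2. */
    static double[] response(double[][] x, double[] w) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < w.length; j++) y[i] += w[j] * x[i][j];
        }
        return y;
    }

    public static void main(String[] args) {
        double[] w = {10.0, 0.5, 1.0 / 3.0};      // chosen w0, w1, w2 (arbitrary)
        double[][] x = randomDesign(200, 42L);
        double[] y = response(x, w);
        double[] estimate = fit(x, y, 0.5, 2000);
        System.out.println("chosen:    " + Arrays.toString(w));
        System.out.println("recovered: " + Arrays.toString(estimate));
    }
}
```

Because x1 and x2 are independent here, the least-squares minimum is unique, so the recovered values should agree with the chosen ones up to numerical tolerance.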

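    // ALPHA*(1/M) in one.
    double  modifier    = alpha / (double)independent.rows();

I think this is a bad idea, because you are mixing the gradient function with the gradient descent algorithm. It is better to keep the gradient descent algorithm in its own public method, such as the following in Java: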

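import org.la4j.Matrix;
import org.la4j.Vector;

// gradient(x, y, thetas) is a separate helper that returns the gradient
// of the cost at thetas; it is not shown here.
public Vector gradientDescent(Matrix x, Matrix y, int kmax, double alpha)
{
    // Start from zeros, one theta per column of x
    Vector thetas = Vector.fromArray(new double[x.columns()]);
    for (int k = 0; k < kmax; k++)
    {
        // Step against the gradient, scaled by the learning rate
        thetas = thetas.subtract(gradient(x, y, thetas).multiply(alpha));
    }
    return thetas;
}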
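The answer does not show the `gradient` helper it calls. A minimal sketch of the same split, written with plain arrays instead of la4j so that it runs on its own, might look like the following. The class name `GradientDescentSketch` and the body of `gradient` are assumptions; in particular, this version puts the 1/m factor inside the gradient, which is where it has to go once `alpha` is applied separately.

```java
import java.util.Arrays;

class GradientDescentSketch {

    /** Gradient of J(theta) = 1/(2m) * ||X*theta - y||^2, i.e. (1/m) * X^T (X*theta - y). */
    static double[] gradient(double[][] x, double[] y, double[] thetas) {
        int m = x.length;
        int n = thetas.length;
        double[] grad = new double[n];
        for (int i = 0; i < m; i++) {
            // residual = hypothesis(x_i) - y_i
            double residual = -y[i];
            for (int j = 0; j < n; j++) residual += x[i][j] * thetas[j];
            for (int j = 0; j < n; j++) grad[j] += residual * x[i][j] / m;
        }
        return grad;
    }

    /** Runs kmax gradient steps from thetas = 0 with learning rate alpha. */
    static double[] gradientDescent(double[][] x, double[] y, int kmax, double alpha) {
        double[] thetas = new double[x[0].length];
        for (int k = 0; k < kmax; k++) {
            double[] grad = gradient(x, y, thetas);
            for (int j = 0; j < thetas.length; j++) thetas[j] -= alpha * grad[j];
        }
        return thetas;
    }

    public static void main(String[] args) {
        double[][] x = { {1, -1}, {1, 0}, {1, 1} };   // first column is the intercept
        double[] y = { -1, 1, 3 };                    // generated from y = 1 + 2*x
        System.out.println(Arrays.toString(gradientDescent(x, y, 200, 0.5)));
    }
}
```

With this split, swapping in a different cost function only means replacing `gradient`; the descent loop stays untouched.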

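I can't see anything wrong with the algorithm steps or the math. I'm not familiar with the Colt library, but I assume the function names are expressive and mean what they say. I assume the first column of your independent matrix is a vector of ones to estimate the intercept. In that case, how do the values differ for the multiple regression?

The first column is 1s for the intercept. I think this may be correct, and what I'm experiencing is collinearity in my test data. I created the test data so that I have x1 and x2, where x2 is just 2*x1. I set the dependent variable to y = .5*x1 + (1/3)*x2. It converges, but not to what I expected. For example, in the case above I get thetas of .6333 (x1) and .2666 (x2). It does correctly pick up whatever intercept I put into the function (e.g. y = .5*x1 + (1/3)*x2 + 10). If I use WEKA on the same dataset, it automatically handles the collinearity and comes up with just 1.1666*x1.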
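The numbers in the comment above are consistent with each other. With x2 = 2*x1 the model only sees theta1 + 2*theta2, and .6333 + 2*.2666 ≈ 1.1666, the same combined slope as .5 + 2*(1/3) and as WEKA's 1.1666*x1. Every pair with that sum fits the data equally well, so where gradient descent ends up depends on where it starts. The sketch below illustrates this with plain arrays. The class name, the five x1 values, the learning rate and the two starting points are all assumptions; the starting values actually used in the question are not stated.

```java
import java.util.Arrays;

class CollinearityDemo {

    /** Batch gradient descent from a given start; each step is theta -= (alpha/m) * X^T (X*theta - y). */
    static double[] descend(double[][] x, double[] y, double[] start, double alpha, int iterations) {
        int m = x.length;
        int n = start.length;
        double[] thetas = start.clone();
        for (int k = 0; k < iterations; k++) {
            double[] grad = new double[n];
            for (int i = 0; i < m; i++) {
                double residual = -y[i];
                for (int j = 0; j < n; j++) residual += x[i][j] * thetas[j];
                for (int j = 0; j < n; j++) grad[j] += residual * x[i][j] / m;
            }
            for (int j = 0; j < n; j++) thetas[j] -= alpha * grad[j];
        }
        return thetas;
    }

    /** Rows of {1, x1, x2} with x2 = 2*x1, for x1 = -2, -1, 0, 1, 2. */
    static double[][] collinearDesign() {
        double[][] x = new double[5][];
        for (int i = 0; i < 5; i++) {
            double x1 = i - 2;
            x[i] = new double[] {1.0, x1, 2.0 * x1};
        }
        return x;
    }

    /** y = .5*x1 + (1/3)*x2 + 10, as in the comment above. */
    static double[] response(double[][] x) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = 0.5 * x[i][1] + x[i][2] / 3.0 + 10.0;
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] x = collinearDesign();
        double[] y = response(x);
        double[] fromZeros = descend(x, y, new double[] {0, 0, 0}, 0.05, 2000);
        double[] fromOnes = descend(x, y, new double[] {0, 1, 1}, 0.05, 2000);
        System.out.println("start (0,0,0): " + Arrays.toString(fromZeros));
        System.out.println("start (0,1,1): " + Arrays.toString(fromOnes));
        // Only the intercept and theta1 + 2*theta2 are pinned down by the data
        System.out.println("combined slopes: " + (fromZeros[1] + 2 * fromZeros[2])
                + " and " + (fromOnes[1] + 2 * fromOnes[2]));
    }
}
```

Both runs should agree on the intercept and on theta1 + 2*theta2 while disagreeing on theta1 and theta2 individually. On this particular data set, starting the two slopes at 1 should land close to the .6333 and .2666 reported above, which would be consistent with the original run having started from ones, though that is only a guess.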