Java: support vector machine does not work as expected

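I am implementing an SVM as described below.

Here is their description (it only uses the function f(x, y) = ax + by + c):

SVM "force specification":

"If we feed a positive datapoint through the SVM circuit and the output value is less than 1, pull on the circuit with force +1. This is a positive example, so we want the score to be higher for it.

Conversely, if we feed a negative datapoint through the SVM and the output is greater than -1, then the circuit is giving this datapoint a dangerously high score: pull on the circuit downwards with force -1.

In addition to the pulls above, always add a small amount of pull on the parameters a, b (notice, not on c!) that pulls them towards zero. You can think of both a and b as being attached to a physical spring that is anchored at zero. Just as with a physical spring, this makes the pull proportional to the value of each of a, b.

For example, if a becomes very high, it will experience a strong pull of magnitude |a| back towards zero. This pull is something we call regularization, and it ensures that neither of our parameters a or b gets disproportionately large.

This would be undesirable because both a and b get multiplied by the input features x, y (remember the equation is ax + by + c), so if either of them is too high, our classifier would be overly sensitive to these features.

This isn't a nice property, because features can often be noisy in practice, so we want our classifier to change relatively smoothly if they wiggle around."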
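
To make the force specification concrete, here is a minimal Java sketch of a single update for one labelled point. It is only one reading of the quoted rules, not the tutorial's code: the class name `SvmForceStep`, the method `step`, and the step size of 0.01 are assumptions chosen for illustration, and the gradients are recomputed from zero for every point rather than carried over.

```java
// Minimal sketch of one update under the force specification quoted above.
// STEP_SIZE and all names are illustrative assumptions, not the tutorial's code.
class SvmForceStep {
    static final double STEP_SIZE = 0.01; // assumed learning rate

    /** Returns the updated {a, b, c} after seeing one labelled point. */
    static double[] step(double a, double b, double c,
                         double x, double y, int label) {
        double score = a * x + b * y + c;

        // Pull on the circuit output: +1, -1, or 0 when the margin is satisfied.
        double pull = 0.0;
        if (label == 1 && score < 1.0) pull = 1.0;
        if (label == -1 && score > -1.0) pull = -1.0;

        // Gradients are computed fresh for this point (nothing is carried over).
        // d(score)/da = x, d(score)/db = y, d(score)/dc = 1.
        double gradA = x * pull;
        double gradB = y * pull;
        double gradC = pull;

        // Regularization: spring pull towards zero on a and b only, not on c.
        gradA += -a;
        gradB += -b;

        return new double[] {
            a + STEP_SIZE * gradA,
            b + STEP_SIZE * gradB,
            c + STEP_SIZE * gradC
        };
    }

    public static void main(String[] args) {
        double[] p = step(1.0, -2.0, -1.0, 1.2, 0.7, 1);
        System.out.println("a=" + p[0] + " b=" + p[1] + " c=" + p[2]);
    }
}
```

With a = 1, b = -2, c = -1 and the point (1.2, 0.7) labelled +1, the score is -1.2, so the pull is +1 and the parameters move to roughly a = 1.002, b = -1.973, c = -0.99.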

Their data consists of six (x, y) points, each with its own label:

var data = []; var labels = [];
data.push([1.2, 0.7]); labels.push(1);
data.push([-0.3, -0.5]); labels.push(-1);
data.push([3.0, 0.1]); labels.push(1);
data.push([-0.1, -1.0]); labels.push(-1);
data.push([-1.0, 1.1]); labels.push(-1);
data.push([2.1, -3]); labels.push(1);
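Here is their code (probably JavaScript?) for the "circuit" (used for backpropagation). It consists of a few add gates and multiply gates, each with its own backpropagation behavior (I am not sure how to explain it here, but I think this is enough). If it is not, you can follow the backpropagation link in Chapter 1: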

// A circuit: it takes 5 Units (x,y,a,b,c) and outputs a single Unit
// It can also compute the gradient w.r.t. its inputs
var Circuit = function() {
  // create some gates
  this.mulg0 = new multiplyGate();
  this.mulg1 = new multiplyGate();
  this.addg0 = new addGate();
  this.addg1 = new addGate();
};
Circuit.prototype = {
  forward: function(x,y,a,b,c) {
    this.ax = this.mulg0.forward(a, x); // a*x
    this.by = this.mulg1.forward(b, y); // b*y
    this.axpby = this.addg0.forward(this.ax, this.by); // a*x + b*y
    this.axpbypc = this.addg1.forward(this.axpby, c); // a*x + b*y + c
    return this.axpbypc;
  },
  backward: function(gradient_top) { // takes pull from above
    this.axpbypc.grad = gradient_top;
    this.addg1.backward(); // sets gradient in axpby and c
    this.addg0.backward(); // sets gradient in ax and by
    this.mulg1.backward(); // sets gradient in b and y
    this.mulg0.backward(); // sets gradient in a and x
  }
}
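
For comparison, the same circuit can be sketched in Java with analytic gate gradients instead of numerical differences. This is a hedged illustration rather than a port of the tutorial: `CircuitSketch`, `Unit`, `MultiplyGate`, `AddGate`, and `LinearCircuit` are names made up here, and each gate simply applies the chain rule to the pull arriving at its output.

```java
// Sketch of the a*x + b*y + c circuit with analytic gate gradients.
// All class names here are illustrative, not taken from the tutorial.
class CircuitSketch {
    /** A value on a wire plus the pull (gradient) flowing back into it. */
    static class Unit {
        double value;
        double grad;
        Unit(double value) { this.value = value; }
    }

    static class MultiplyGate {
        private Unit u0, u1, out;
        Unit forward(Unit u0, Unit u1) {
            this.u0 = u0;
            this.u1 = u1;
            out = new Unit(u0.value * u1.value);
            return out;
        }
        void backward() {
            // d(u0*u1)/du0 = u1 and d(u0*u1)/du1 = u0, chained with out.grad
            u0.grad += u1.value * out.grad;
            u1.grad += u0.value * out.grad;
        }
    }

    static class AddGate {
        private Unit u0, u1, out;
        Unit forward(Unit u0, Unit u1) {
            this.u0 = u0;
            this.u1 = u1;
            out = new Unit(u0.value + u1.value);
            return out;
        }
        void backward() {
            // d(u0+u1)/du0 = d(u0+u1)/du1 = 1, so the pull passes through
            u0.grad += out.grad;
            u1.grad += out.grad;
        }
    }

    static class LinearCircuit {
        private final MultiplyGate mul0 = new MultiplyGate();
        private final MultiplyGate mul1 = new MultiplyGate();
        private final AddGate add0 = new AddGate();
        private final AddGate add1 = new AddGate();
        private Unit out;

        Unit forward(Unit x, Unit y, Unit a, Unit b, Unit c) {
            Unit ax = mul0.forward(a, x);      // a*x
            Unit by = mul1.forward(b, y);      // b*y
            Unit axpby = add0.forward(ax, by); // a*x + b*y
            out = add1.forward(axpby, c);      // a*x + b*y + c
            return out;
        }

        void backward(double gradientTop) {
            out.grad = gradientTop; // pull from above
            add1.backward();        // reaches axpby and c
            add0.backward();        // reaches ax and by
            mul1.backward();        // reaches b and y
            mul0.backward();        // reaches a and x
        }
    }

    public static void main(String[] args) {
        Unit x = new Unit(1.2), y = new Unit(0.7);
        Unit a = new Unit(1.0), b = new Unit(-2.0), c = new Unit(-1.0);
        LinearCircuit circuit = new LinearCircuit();
        System.out.println("score = " + circuit.forward(x, y, a, b, c).value);
        circuit.backward(1.0);
        System.out.println("grad a = " + a.grad + ", grad b = " + b.grad + ", grad c = " + c.grad);
    }
}
```

Because every gate adds to `grad` with `+=`, any `Unit` that is reused across data points needs its `grad` set back to zero before the next backward pass in this sketch.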
Their training run is supposed to output the following:

training accuracy at iteration 0: 0.3333333333333333
training accuracy at iteration 25: 0.3333333333333333
training accuracy at iteration 50: 0.5
training accuracy at iteration 75: 0.5
training accuracy at iteration 100: 0.3333333333333333
training accuracy at iteration 125: 0.5
training accuracy at iteration 150: 0.5
training accuracy at iteration 175: 0.5
training accuracy at iteration 200: 0.5
training accuracy at iteration 225: 0.6666666666666666
training accuracy at iteration 250: 0.6666666666666666
training accuracy at iteration 275: 0.8333333333333334
training accuracy at iteration 300: 1
training accuracy at iteration 325: 1
training accuracy at iteration 350: 1
training accuracy at iteration 375: 1 
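I simply followed their instructions, but I don't think I really understand the "regularization" part. It says: give a and b a force towards 0, proportional to their values. My understanding is: if you apply a force, you change the gradient over time, which means you add a term proportional to the parameter's value to the current gradient. This is done in the function "newGrad" in the SVM class, which is used by the function "regularization()".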

I think that may be where I went wrong. However, I also suspect the lack of "stochastic gradient descent".
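
The regularization reading can at least be checked in isolation. A pull towards zero that is proportional to the value is a force of -a, and sign * |a| with the sign chosen opposite to a is algebraically the same quantity. The sketch below compares the two; `newGrad` mirrors the method of the same name in the Java source in this question, while `RegularizationCheck` and `springPull` are names invented here for illustration.

```java
// Compares the sign-based formula with the direct "-value" spring pull.
// RegularizationCheck and springPull are hypothetical names for this sketch.
class RegularizationCheck {
    // Mirrors the newGrad method from the Java source in this question.
    static float newGrad(float current, float curGrad) {
        float sign = (current < 0) ? 1 : -1;
        return curGrad + sign * Math.abs(current);
    }

    // The spring pull written directly: a force of -current added to the gradient.
    static float springPull(float current, float curGrad) {
        return curGrad - current;
    }

    public static void main(String[] args) {
        float[] samples = {-2.0f, 0.0f, 3.5f};
        for (float v : samples) {
            System.out.println(v + ": " + newGrad(v, 0.7f) + " vs " + springPull(v, 0.7f));
        }
    }
}
```

This only checks the formula itself. It says nothing about whether the pull is applied at the right moment during training, or whether the gradient it is added to has been reset for the current data point.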

The explanation on the website:

f(x,y)=ax+by+c
In this expression we think of x and y as the inputs (the 2D vectors) and a,b,c as the parameters of the function that we will want to learn. For example, if a = 1, b = -2, c = -1, then the function will take the first datapoint ([1.2, 0.7]) and output 1 * 1.2 + (-2) * 0.7 + (-1) = -1.2. Here is how the training will work:

1. We select a random datapoint and feed it through the circuit.
2. We will interpret the output of the circuit as a confidence that the datapoint has class +1. (i.e. very high values = circuit is very certain datapoint has class +1 and very low values = circuit is certain this datapoint has class -1.)
3. We will measure how well the prediction aligns with the provided labels. Intuitively, for example, if a positive example scores very low, we will want to tug in the positive direction on the circuit, demanding that it should output a higher value for this datapoint. Note that this is the case for the first datapoint: it is labeled as +1 but our predictor function only assigns it value -1.2. We will therefore tug on the circuit in the positive direction; we want the value to be higher.
4. The circuit will take the tug and backpropagate it to compute tugs on the inputs a,b,c,x,y.
5. Since we think of x,y as (fixed) datapoints, we will ignore the pull on x,y. If you're a fan of my physical analogies, think of these inputs as pegs, fixed in the ground.
6. On the other hand, we will take the parameters a,b,c and make them respond to their tug (i.e. we'll perform what we call a parameter update). This, of course, will make it so that the circuit will output a slightly higher score on this particular datapoint in the future.
7. Iterate! Go back to step 1.
The training scheme I described above is commonly referred to as Stochastic Gradient Descent.
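
The training scheme above can be sketched in Java as follows. This is an illustration under stated assumptions, not the tutorial's implementation: the class `SgdSketch`, the seeded `Random`, and the step size passed in by the caller are choices made here, and the tug is taken to be the margin rule from the force specification (pull of +1 or -1 when the score is on the wrong side of the margin, plus a pull of -a and -b towards zero).

```java
import java.util.Random;

// Sketch of the stochastic gradient descent loop described above.
// The tug rule, step size, and seed handling are assumptions for illustration.
class SgdSketch {
    static final double[][] DATA = {
        {1.2, 0.7}, {-0.3, -0.5}, {3.0, 0.1}, {-0.1, -1.0}, {-1.0, 1.1}, {2.1, -3.0}
    };
    static final int[] LABELS = {1, -1, 1, -1, -1, 1};

    /** f(x, y) = a*x + b*y + c with p = {a, b, c}. */
    static double score(double[] p, double[] point) {
        return p[0] * point[0] + p[1] * point[1] + p[2];
    }

    /** Fraction of the six points whose predicted sign matches the label. */
    static double accuracy(double[] p) {
        int correct = 0;
        for (int i = 0; i < DATA.length; i++) {
            int predicted = score(p, DATA[i]) > 0 ? 1 : -1;
            if (predicted == LABELS[i]) correct++;
        }
        return (double) correct / DATA.length;
    }

    static double[] train(double[] init, int iterations, double stepSize, long seed) {
        double[] p = init.clone();
        Random rand = new Random(seed);
        for (int iter = 0; iter < iterations; iter++) {
            // Step 1: pick a random data point.
            int i = rand.nextInt(DATA.length);
            // Steps 2-3: score it and decide the tug (assumed margin rule).
            double s = score(p, DATA[i]);
            double pull = 0.0;
            if (LABELS[i] == 1 && s < 1.0) pull = 1.0;
            if (LABELS[i] == -1 && s > -1.0) pull = -1.0;
            // Steps 4-5: gradients on a, b, c only; x and y are fixed pegs.
            // Computed fresh each iteration, with a pull towards zero on a, b.
            double gradA = DATA[i][0] * pull - p[0];
            double gradB = DATA[i][1] * pull - p[1];
            double gradC = pull;
            // Step 6: parameter update.
            p[0] += stepSize * gradA;
            p[1] += stepSize * gradB;
            p[2] += stepSize * gradC;
        }
        return p;
    }

    public static void main(String[] args) {
        double[] init = {1.0, -2.0, -1.0};
        double[] trained = train(init, 400, 0.01, 42L);
        System.out.println("accuracy before: " + accuracy(init));
        System.out.println("accuracy after:  " + accuracy(trained));
    }
}
```

The gradient variables are local to each iteration, so nothing accumulates from one data point to the next. Fixing the seed makes a run repeatable, which helps when comparing against another implementation.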
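I think this is the same as the perceptron algorithm.

However, after this part they move on to the SVM. I expected them to describe the training as part of the SVM, but they just say "Now let's train the SVM with stochastic gradient descent", and when I read through it I could not find anything else: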

var data = []; var labels = [];
data.push([1.2, 0.7]); labels.push(1);
data.push([-0.3, -0.5]); labels.push(-1);
data.push([3.0, 0.1]); labels.push(1);
data.push([-0.1, -1.0]); labels.push(-1);
data.push([-1.0, 1.1]); labels.push(-1);
data.push([2.1, -3]); labels.push(1);
var svm = new SVM();

// a function that computes the classification accuracy
var evalTrainingAccuracy = function() {
  var num_correct = 0;
  for(var i = 0; i < data.length; i++) {
    var x = new Unit(data[i][0], 0.0);
    var y = new Unit(data[i][1], 0.0);
    var true_label = labels[i];

    // see if the prediction matches the provided label
    var predicted_label = svm.forward(x, y).value > 0 ? 1 : -1;
    if(predicted_label === true_label) {
      num_correct++;
    }
  }
  return num_correct / data.length;
};

// the learning loop
for(var iter = 0; iter < 400; iter++) {
  // pick a random data point
  var i = Math.floor(Math.random() * data.length);
  var x = new Unit(data[i][0], 0.0);
  var y = new Unit(data[i][1], 0.0);
  var label = labels[i];
  svm.learnFrom(x, y, label);

  if(iter % 25 == 0) { // every 25 iterations...
    console.log('training accuracy at iter ' + iter + ': ' + evalTrainingAccuracy());
  }
}
I think I am missing something very important. Please help me, thank you very much.

Here is my source code (Java):

/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */
package javaapplication7;

/**
 *
 * @author truong
 */
/*
 * To change this license header, choose License Headers in Project Properties.
 * To change this template file, choose Tools | Templates
 * and open the template in the editor.
 */


//import java.lang.Math;
import java.util.Random;

/**
 *
 * @author truong
 */
public class JavaApplication7 {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {
        // TODO code application logic here
        SVM TestSVM ;
        float step_size=0.001f;
        float pull,score;
        //data.push([1.2, 0.7]); labels.push(1);
        //data.push([-0.3, -0.5]); labels.push(-1);
        //data.push([3.0, 0.1]); labels.push(1);
        //data.push([-0.1, -1.0]); labels.push(-1);
        //data.push([-1.0, 1.1]); labels.push(-1);
        //data.push([2.1, -3]); labels.push(1);
        float[] datax  = {1.2f, -0.3f, 3.0f, -0.1f, -1.0f, 2.1f} ;
        float[] datay  = {0.7f, -0.5f, 0.1f, -1.0f,  1.1f,  -3f} ;
        boolean[] labels={true,false,true,false,false,true};
        float accuracy,result;
        int i,length = 6,randint;
        int iter,correct;
        Random rand= new Random();
        boolean type,label;
        TestSVM = new SVM();
        TestSVM.init(1.0f, -2.0f, -1.0f);
        float a =1, b =-2, c=-1;
        float x,y;
        for (iter=0; iter<= 400; iter++){
            randint = rand.nextInt(length);
            TestSVM.setSVM(datax[randint], datay[randint], labels[randint]);




            if (iter %25 ==0){            
                correct =0;
                for (i=0 ; i<length; i++){
                    result = TestSVM.calculate(datax[i],datay[i]);
                    //result = a * datax[i] + b * datay[i] + c;
                    type= (result>0);
                    //if (result >=0)
                    //    type = true;
                    //else
                    //    type = false;
                    if (type == labels[i])
                        correct++;
                }
                accuracy = ((float) correct ) / length;
                System.out.println("Training accuracy at the "+ iter +" iteration:"+ accuracy);
            }

        }

    }

}

class AddGate{
    private float gradx, grady;
    private float backx, backy;
    final float step = 0.0001f;
    public float calculate(float x, float y){
        return (x+y);
    }
    public void setGradx(float x, float y){
        float dif;
        dif = this.calculate(x+step,y) - this.calculate(x, y);
        gradx = (dif / step );
    }

    public void setGrady(float x, float y){
        float dif;
        dif = this.calculate(x,y +step) - this.calculate(x,y);
        grady = (dif / step);
    }   

    public float setAddGate(float x, float y){
        float val;
        val = calculate(x,y);
        setGradx(x,y);
        setGrady(x,y);
        return val;
    }

    public void backWard(float out){
        backx += gradx * out;
        backy += grady * out;
    }

    public float getBackx(){
        return backx;
    }

    public float getBacky(){
        return backy;
    }
}

class MulGate{
    private float gradq, gradz;
    private float backq, backz;
    private final float step = 0.0001f;
    public float calculate(float q, float z){
        return (q*z);
    }
    public void setGradq(float q, float z){
        float dif;
        dif = this.calculate(q+step,z) - this.calculate(q, z);
        gradq = (dif / step );
    }

    public void setGradz(float q, float z){
        float dif;
        dif = this.calculate(q,z +step) - this.calculate(q,z);
        gradz = (dif / step);
    }

    public float setMulGate(float q, float z){
        float val;
        val =calculate(q,z);
        setGradq(q,z);
        setGradz(q,z);
        return val;
    }

    public void backWard(float out){
        backq += gradq * out;
        backz += gradz * out;
    }    

    public float getBackq(){
        return backq;
    }

    public float getBackz(){
        return backz;
    }
}

class SigmoidGate{
    private float backx, backy, backz;
    private AddGate AGate = new AddGate();
    private MulGate MGate = new MulGate();

    public void SigmoidGate(){
        AGate = new AddGate();
        MGate = new MulGate();
    }

    public float calculate (float x, float y, float z){
        float q;
        q = AGate.calculate(x,y);
        return MGate.calculate(q,z);
    }

    public void setBackProp(float x, float y, float z){
        float q;
        float tmp;
        q=AGate.setAddGate(x,y);
        MGate.setMulGate(q,z);
        MGate.backWard(1);
        AGate.backWard(MGate.getBackq());
        backx = AGate.getBackx();
        backy = AGate.getBacky();
        backz = MGate.getBackz();
    }

    public float getBackx(){
        return backx;
    }

    public float getBacky(){
        return backy;
    }

    public float getBackz(){
        return backz;
    }

}

class GoodGate{
    private float backa,backb,backc;
    private float grada,gradb,gradc;
    private MulGate ax,by;
    private AddGate axPby,axPbyPc;          //ax + by and ax+by+c

    public void GoodGate(){
        ax = new MulGate();
        by = new MulGate();
        axPby = new AddGate();
        axPbyPc = new AddGate();
    }

    public float calculate(float a, float x, float b, float y, float c){
        return (a*x + b*y + c  );
    }

    public float setBackProp(float a, float x, float b, float y, float c, float pull){
        float val,tmp1,tmp2,tmp3;

        ax = new MulGate();
        by = new MulGate();
        axPby = new AddGate();
        axPbyPc = new AddGate();
        val = calculate(a,x,b,y,c);
        //forward pass
        tmp1 = ax.setMulGate(a, x);
        tmp2 = by.setMulGate(b, y);
        tmp3 = axPby.setAddGate(tmp1,tmp2);
        axPbyPc.setAddGate(tmp3, c);
        //backward pass
        axPbyPc.backWard(pull);
        backc = axPbyPc.getBacky();
        axPby.backWard( axPbyPc.getBackx() );
        ax.backWard(axPby.getBackx() );
        by.backWard(axPby.getBacky() );
        backa = ax.getBackq();
        backb = by.getBackq();              

        return val;
    }

    public float getBacka(){
        return backa;
    }

    public float getBackb(){
        return backb;
    }

    public float getBackc(){
        return backc;
    }
}

class SVM{
    private GoodGate GGate;
    private float grada,gradb,gradc;
    private float a,b,c;

    public void SVM(){
        GGate = new GoodGate();
    }
    private float newGrad(float current,float curGrad){     //not very good
        float sign;
        if (current<0)
            sign = 1;
        else
            sign = -1;
        return (curGrad+  sign* Math.abs(current)) ; 
    }


    public float calculate(float x, float y){
        return GGate.calculate(a, x, b, y, c);
    }

    public void init(float firsta, float firstb, float firstc){
        a=firsta; b= firstb; c= firstc;
        grada = 0; gradb = 0; gradc = 0;
    }

    private void regularization(){
        grada = newGrad(a,grada);
        gradb = newGrad(b,gradb);
    }

    private void gradPull(float x, float y, float pull){
        GGate.setBackProp(a,x,b,y,c,pull);
        grada += GGate.getBacka();
        gradb += GGate.getBackb();
        gradc += GGate.getBackc();
    }

    public void setSVM( float x,  float y, boolean label){
        float step_size = 0.01f;
        float val;
        float pull;
        GGate = new GoodGate();
        val = GGate.calculate(a,x,b,y,c);
        //grada=0;gradb=0; gradc=0;
        if (label == true){
            if (val < 1){
                pull = 1;
            }
            else 
            {
                pull = 0;
            }
        }
        else{
            if (val> -1){
                pull = -1;
            }
            else
            {
                pull = 0;
            }
        }
        gradPull(x,y,pull);
        //regularization();
        updateSVM();
        //a += step_size * (x * pull* GGate.getBacka() - a); // -a is from the regularization
        //b += step_size * (y * pull* GGate.getBackb() - b); // -b is from the regularization
        //c += step_size * (1 * pull* GGate.getBackc() );

    }
    private void updateSVM(){
        float step_size=0.001f;
        //System.out.println("grad a is"+ grada);
        //System.out.println("grad b is"+ gradb);
        //System.out.println("grad c is"+ gradc);
        a += grada * step_size;
        b += gradb * step_size;
        c += gradc * step_size;
        //System.out.println("a is "+ a);
        //System.out.println("b is "+ b);
        //System.out.println("c is "+ c);
        //System.out.println("-----------");
    }

    public float getA(){
        return a;
    }

    public float getB(){
        return b;
    }

}