Java liblinear usage format


I am using the .NET implementation of liblinear in my C# code through the following NuGet package:

But in liblinear's README file, the format of x is given as follows:

`struct problem` describes the problem:

    struct problem
    {
        int l, n;
        int *y;
        struct feature_node **x;
        double bias;
    };

where `l` is the number of training data. If bias >= 0, we assume
that one additional feature is added to the end of each data
instance. `n` is the number of feature (including the bias feature
if bias >= 0). `y` is an array containing the target values. (integers
in classification, real numbers in regression) And `x` is an array
of pointers, each of which points to a sparse representation (array
of feature_node) of one training vector.

For example, if we have the following training data:

    LABEL       ATTR1   ATTR2   ATTR3   ATTR4   ATTR5
    -----       -----   -----   -----   -----   -----
    1           0       0.1     0.2     0       0
    2           0       0.1     0.3    -1.2     0
    1           0.4     0       0       0       0
    2           0       0.1     0       1.4     0.5
    3          -0.1    -0.2     0.1     1.1     0.1

and bias = 1, then the components of problem are:

    l = 5
    n = 6

    y -> 1 2 1 2 3

    x -> [ ] -> (2,0.1) (3,0.2) (6,1) (-1,?)
         [ ] -> (2,0.1) (3,0.3) (4,-1.2) (6,1) (-1,?)
         [ ] -> (1,0.4) (6,1) (-1,?)
         [ ] -> (2,0.1) (4,1.4) (5,0.5) (6,1) (-1,?)
         [ ] -> (1,-0.1) (2,-0.2) (3,0.1) (4,1.1) (5,0.1) (6,1) (-1,?)
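
As a rough illustration of the layout the README describes (this is not code from liblinear itself or from either wrapper), the sketch below converts one dense row into a sparse `feature_node` array. The helper name `dense_to_sparse` is hypothetical; `struct feature_node` is declared locally with the `index` and `value` fields used by liblinear.

```c
#include <assert.h>
#include <stdlib.h>

/* Same shape as liblinear's feature_node: a 1-based index and a value. */
struct feature_node {
    int index;
    double value;
};

/*
 * Hypothetical helper: convert one dense row of n_attr attributes into a
 * sparse feature_node array. Zero-valued attributes are skipped. If
 * bias >= 0, a node (n_attr + 1, bias) is appended. The array always ends
 * with a terminator node whose index is -1.
 * Returns a malloc'ed array (the caller frees it), or NULL on failure.
 */
struct feature_node *dense_to_sparse(const double *row, int n_attr, double bias)
{
    assert(row != NULL && n_attr >= 0);

    /* Worst case: every attribute is nonzero, plus bias node and terminator. */
    struct feature_node *nodes = malloc((size_t)(n_attr + 2) * sizeof *nodes);
    if (nodes == NULL)
        return NULL;

    int k = 0;
    for (int j = 0; j < n_attr; j++) {
        if (row[j] != 0.0) {
            nodes[k].index = j + 1;   /* indices are 1-based */
            nodes[k].value = row[j];
            k++;
        }
    }
    if (bias >= 0) {
        nodes[k].index = n_attr + 1;  /* the bias feature comes last */
        nodes[k].value = bias;
        k++;
    }
    nodes[k].index = -1;              /* terminator; its value is unused */
    nodes[k].value = 0.0;
    return nodes;
}
```

For the first row of the table with bias = 1, this produces (2,0.1) (3,0.2) (6,1) (-1,?), which matches the first entry of `x` in the README example.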
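
However, in the example showing the Java implementation:


So the author does not store the nodes in liblinear's sparse format. Does anyone know the correct format of x for the liblinear implementation?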

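Although this does not exactly address the library you mentioned, I can offer you an alternative. The Accord.NET Framework has recently incorporated all of LIBLINEAR's algorithms into its machine learning namespaces. It is also available through NuGet.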

In this library, the direct syntax for creating a linear support vector machine from in-memory data is:

    using Accord.Math; // GetColumns, GetColumn, ToInt32 and Apply extensions

    // Create a simple binary AND
    // classification problem:

    double[][] problem =
    {
        //             a    b    a AND b
        new double[] { 0,   0,     0    },
        new double[] { 0,   1,     0    },
        new double[] { 1,   0,     0    },
        new double[] { 1,   1,     1    },
    };

    // Get the first two columns as the problem
    // inputs and the last column as the output

    // input columns
    double[][] inputs = problem.GetColumns(0, 1);

    // output column
    int[] outputs = problem.GetColumn(2).ToInt32();

    // However, SVMs expect the output value to be
    // either -1 or +1. As such, we have to convert
    // it so the vector contains { -1, -1, -1, +1 }:
    //
    outputs = outputs.Apply(x => x == 0 ? -1 : 1);

After the problem has been created, a linear SVM can be learned using:

    using System; // Math.Sign
    using Accord.MachineLearning.VectorMachines;
    using Accord.MachineLearning.VectorMachines.Learning;

    // Create a new linear-SVM for two inputs (a and b)
    SupportVectorMachine svm = new SupportVectorMachine(inputs: 2);

    // Create an L2-regularized L2-loss support vector classification
    var teacher = new LinearDualCoordinateDescent(svm, inputs, outputs)
    {
        Loss = Loss.L2,
        Complexity = 1000,
        Tolerance = 1e-5
    };

    // Learn the machine
    double error = teacher.Run(computeError: true);

    // Compute the machine's answers for the learned inputs
    int[] answers = inputs.Apply(x => Math.Sign(svm.Compute(x)));
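
However, this assumes your data is already in memory. If you wish to load your data from disk, from a file in LIBSVM's sparse format, you can use the framework's SparseReader class. An example of how to use it is shown below: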

    using System.Text; // Encoding
    using Accord.IO;   // SparseReader

    // Suppose we are going to read a sparse sample file containing
    //  samples which have an actual dimension of 4. Since the samples
    //  are in a sparse format, each entry in the file will probably
    //  have a much smaller number of elements.
    //
    int sampleSize = 4;

    // Create a new Sparse Sample Reader to read any given file,
    //  passing the correct dense sample size in the constructor.
    //  `file` is assumed to be defined elsewhere and to refer to the data file.
    //
    SparseReader reader = new SparseReader(file, Encoding.Default, sampleSize);

    // Declare a vector to obtain the label
    //  of each of the samples in the file
    //
    int[] labels = null;

    // Declare a vector to obtain the description (or comments)
    //  about each of the samples in the file, if present.
    //
    string[] descriptions = null;

    // Read the sparse samples and store them in a dense vector array
    double[][] samples = reader.ReadToEnd(out labels, out descriptions);

Afterwards, the `samples` and `labels` vectors can be used as the inputs and outputs of the problem, respectively.
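
To make the sparse-to-dense step more concrete, here is a minimal sketch in C of expanding one line of a LIBSVM-style sparse file (`<label> <index>:<value> ...`, with 1-based indices) into a dense vector. It is not the framework's implementation. The function name `parse_sparse_line` is hypothetical, and treating `#` as the start of a trailing description is an assumption made only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse one line of the form
 *     <label> <index>:<value> <index>:<value> ...
 * into a dense vector of length sample_size. Features that do not appear
 * on the line are left at 0. Parsing stops at end of line or at '#'
 * (assumed here to introduce a description).
 * Returns 0 on success, -1 on malformed input or an out-of-range index.
 */
int parse_sparse_line(const char *line, int sample_size, int *label, double *dense)
{
    assert(line != NULL && label != NULL && dense != NULL && sample_size > 0);

    char *end;
    long lab = strtol(line, &end, 10);
    if (end == line)
        return -1;                      /* no label found */
    *label = (int)lab;

    for (int i = 0; i < sample_size; i++)
        dense[i] = 0.0;                 /* absent features are zero */

    const char *p = end;
    for (;;) {
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '\0' || *p == '\n' || *p == '\r' || *p == '#')
            break;

        long idx = strtol(p, &end, 10);
        if (end == p || *end != ':')
            return -1;                  /* expected "<index>:" */
        if (idx < 1 || idx > sample_size)
            return -1;                  /* index outside the dense size */

        p = end + 1;                    /* skip the ':' */
        double value = strtod(p, &end);
        if (end == p)
            return -1;                  /* missing value */

        dense[idx - 1] = value;         /* file indices are 1-based */
        p = end;
    }
    return 0;
}
```

With `sample_size` set to 4, the line `2 1:0.5 4:-1.2` gives the label 2 and the dense vector {0.5, 0, 0, -1.2}.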

I hope this helps!

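Disclaimer: I am the author of this library. I am answering this question in the sincere hope that it will be useful to the OP, since I faced the same problem not long ago. If a moderator thinks this looks like spam, feel free to delete it. However, I am only posting this because I think it might help others. I even came across this question by mistake, while searching for existing C# implementations of LIBSVM rather than LIBLINEAR.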
