C# ML predicts strange values

I recently became interested in machine learning in C#. I downloaded sample code from the Microsoft website and wanted to test it.

Code:

using System;
using Microsoft.ML;
using Microsoft.ML.Data;

class Program
{
    public class HouseData
    {
        public float Size { get; set; }
        public float Price { get; set; }
    }

    public class Prediction
    {
        [ColumnName("Score")]
        public float Price { get; set; }
    }

    static void Main(string[] args)
    {
        MLContext mlContext = new MLContext();

        // 1. Import or create training data
        HouseData[] houseData = {
               new HouseData() { Size = 100, Price = 10 },
               new HouseData() { Size = 200, Price = 20 },
               new HouseData() { Size = 300, Price = 30 },
               new HouseData() { Size = 400, Price = 40 },
               new HouseData() { Size = 500, Price = 50 },
               new HouseData() { Size = 600, Price = 60 },
               new HouseData() { Size = 700, Price = 70 },
               new HouseData() { Size = 800, Price = 80 } };
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(houseData);

        // 2. Specify data preparation and model training pipeline
        var pipeline = mlContext.Transforms.Concatenate("Features", new[] { "Size" })
            .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Price", 
       maximumNumberOfIterations: 100));

        // 3. Train model
        var model = pipeline.Fit(trainingData);

        // 4. Make a prediction
        var size = new HouseData() { Size = 400 };
        var price = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(model).Predict(size);

        Console.WriteLine($"Predicted price for size: {size.Size} is {price.Price}");
    }
}

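It seems logical to me that the program should return 40. What am I doing wrong that I get a completely different result?

Regarding the different results: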

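See the documentation on the added seed parameter,

and on the SDCA-specific options:

For reproducible results, it is recommended to set 'Shuffle' to False and 'NumThreads' to 1.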

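Many operations in ML.NET are non-deterministic. You are seeing this because you train the model on every program run, and because of the factors highlighted above, the training itself is non-deterministic.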

So, if you need to retrain the model with deterministic output, you can set the seed parameter in the MLContext constructor, Shuffle to false, and NumThreads to 1.
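
The three settings above can be sketched as follows. This is a minimal sketch rather than the answerer's original code, and it rests on assumptions: the option properties Shuffle and NumberOfThreads on SdcaRegressionTrainer.Options are the names I would expect in ML.NET 1.x (the quoted documentation calls the latter NumThreads, which may be the name in older versions), and the class name DeterministicSdca and the method TrainAndPredict are hypothetical.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Trainers;

public static class DeterministicSdca
{
    public class HouseData
    {
        public float Size { get; set; }
        public float Price { get; set; }
    }

    public class Prediction
    {
        [ColumnName("Score")]
        public float Price { get; set; }
    }

    // Trains on the question's data with fixed randomness and predicts one price.
    public static float TrainAndPredict(float size)
    {
        // A fixed seed makes the context's random number generation repeatable.
        var mlContext = new MLContext(seed: 0);

        // Same data as the question: sizes 100..800, price = size / 10.
        var houseData = new HouseData[8];
        for (int i = 0; i < houseData.Length; i++)
        {
            houseData[i] = new HouseData { Size = 100 * (i + 1), Price = 10 * (i + 1) };
        }
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(houseData);

        // Assumed option names for ML.NET 1.x; check them against your version.
        var options = new SdcaRegressionTrainer.Options
        {
            LabelColumnName = "Price",
            FeatureColumnName = "Features",
            MaximumNumberOfIterations = 100,
            Shuffle = false,      // keep the row order fixed between passes
            NumberOfThreads = 1   // avoid thread-scheduling effects
        };

        var pipeline = mlContext.Transforms.Concatenate("Features", "Size")
            .Append(mlContext.Regression.Trainers.Sdca(options));

        var model = pipeline.Fit(trainingData);
        var engine = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(model);
        return engine.Predict(new HouseData { Size = size }).Price;
    }

    public static void Main()
    {
        Console.WriteLine($"Predicted price for size 400: {TrainAndPredict(400f)}");
    }
}
```

With these settings, repeated runs should print the same number. That number is not necessarily 40, since the settings only remove the run-to-run variation.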

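Typically, for supervised learning techniques such as regression, training is performed separately from prediction: the trained model is stored so that it can then be used for many predictions.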
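
One way to separate the two steps is to train once, save the model to a file, and load it whenever predictions are needed. The sketch below assumes the Model.Save and Model.Load methods of ML.NET 1.x; the file name model.zip, the class name TrainOncePredictMany, and both method names are hypothetical and chosen only for illustration.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

public static class TrainOncePredictMany
{
    public class HouseData
    {
        public float Size { get; set; }
        public float Price { get; set; }
    }

    public class Prediction
    {
        [ColumnName("Score")]
        public float Price { get; set; }
    }

    // Trains once on the question's data and writes the model to disk.
    public static void TrainAndSave(string modelPath)
    {
        var mlContext = new MLContext(seed: 0);

        var houseData = new HouseData[8];
        for (int i = 0; i < houseData.Length; i++)
        {
            houseData[i] = new HouseData { Size = 100 * (i + 1), Price = 10 * (i + 1) };
        }
        IDataView trainingData = mlContext.Data.LoadFromEnumerable(houseData);

        var pipeline = mlContext.Transforms.Concatenate("Features", "Size")
            .Append(mlContext.Regression.Trainers.Sdca(
                labelColumnName: "Price", maximumNumberOfIterations: 100));

        ITransformer model = pipeline.Fit(trainingData);

        // The input schema is stored alongside the model.
        mlContext.Model.Save(model, trainingData.Schema, modelPath);
    }

    // Loads the stored model and reuses it for any number of predictions.
    public static float[] LoadAndPredict(string modelPath, params float[] sizes)
    {
        var mlContext = new MLContext();
        ITransformer model = mlContext.Model.Load(modelPath, out DataViewSchema inputSchema);
        var engine = mlContext.Model.CreatePredictionEngine<HouseData, Prediction>(model);

        var prices = new float[sizes.Length];
        for (int i = 0; i < sizes.Length; i++)
        {
            prices[i] = engine.Predict(new HouseData { Size = sizes[i] }).Price;
        }
        return prices;
    }

    public static void Main()
    {
        const string modelPath = "model.zip"; // hypothetical file name
        TrainAndSave(modelPath);
        foreach (float price in LoadAndPredict(modelPath, 250f, 400f, 650f))
        {
            Console.WriteLine(price);
        }
    }
}
```

Because the loaded model is fixed, the same input gives the same prediction on every run, regardless of how the training itself behaved.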

As for your expected value of 40:

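It looks like you are expecting the result of a linear regression. SdcaRegressionTrainer is not the same as linear regression, and this is reflected in the results you see.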
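
To make the expectation concrete, here is what an ordinary least-squares line fitted to the question's data predicts. This is plain C# with no ML.NET involved; the class LeastSquares and its Fit method are hypothetical helpers written for this illustration, and they are not what SdcaRegressionTrainer does internally.

```csharp
using System;

public static class LeastSquares
{
    // Fits y = slope * x + intercept by ordinary least squares (closed form).
    public static (double Slope, double Intercept) Fit(double[] x, double[] y)
    {
        if (x.Length != y.Length || x.Length < 2)
        {
            throw new ArgumentException("x and y must have the same length (at least 2).");
        }

        int n = x.Length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++)
        {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // slope = covariance(x, y) / variance(x)
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++)
        {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0)
        {
            throw new ArgumentException("All x values are identical; the slope is undefined.");
        }

        double slope = sxy / sxx;
        return (slope, meanY - slope * meanX);
    }

    public static void Main()
    {
        // Same data as the question.
        double[] sizes = { 100, 200, 300, 400, 500, 600, 700, 800 };
        double[] prices = { 10, 20, 30, 40, 50, 60, 70, 80 };

        var (slope, intercept) = Fit(sizes, prices);
        Console.WriteLine($"Predicted price for size 400 is {slope * 400 + intercept}");
    }
}
```

The data lie exactly on the line price = size / 10, so the least-squares fit gives a slope of about 0.1, an intercept of about 0, and a prediction very close to 40 for size 400. A trainer that optimizes a different objective, or stops after a limited number of iterations, can land on a different value.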


For more details on SdcaRegressionTrainer, see its documentation. If you need deterministic output, try setting the seed:

new MLContext(seed: 0)
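
Please read the tag description. I tried using the seed and ran the program several times; it returns a value of about 31. Close, but not there yet.

@Adam I think "weird" refers to the different results. I have updated the answer with the expected result.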