Correlation of two arrays in C#

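I have two arrays of double values and want to compute their correlation coefficient (a single double value, just like the CORREL function in MS Excel). Is there a simple one-line solution in C#?

I have already found a math library called Meta Numerics. Reportedly it should do the job. There is documentation for the Meta Numerics correlation method, but I don't understand it.

Could someone please provide a simple code snippet or example showing how to use the library?

Note: In the end, I was forced to use one of the custom implementations. But if anyone reading this question knows a good, well-documented C# math library/framework that does this, please don't hesitate to post a link in an answer.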


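If you don't want to use a third-party library, you can use the following method (code posted here as a backup):

public double Correlation(double[] array1, double[] array2)
{
    double[] array_xy = new double[array1.Length];
    double[] array_xp2 = new double[array1.Length];
    double[] array_yp2 = new double[array1.Length];
    for (int i = 0; i < array1.Length; i++)
        array_xy[i] = array1[i] * array2[i];
    for (int i = 0; i < array1.Length; i++)
        array_xp2[i] = Math.Pow(array1[i], 2.0);
    for (int i = 0; i < array1.Length; i++)
        array_yp2[i] = Math.Pow(array2[i], 2.0);
    double sum_x = 0;
    double sum_y = 0;
    foreach (double n in array1)
        sum_x += n;
    foreach (double n in array2)
        sum_y += n;
    double sum_xy = 0;
    foreach (double n in array_xy)
        sum_xy += n;
    double sum_xpow2 = 0;
    foreach (double n in array_xp2)
        sum_xpow2 += n;
    double sum_ypow2 = 0;
    foreach (double n in array_yp2)
        sum_ypow2 += n;
    double Ex2 = Math.Pow(sum_x, 2.00);
    double Ey2 = Math.Pow(sum_y, 2.00);

    return (array1.Length * sum_xy - sum_x * sum_y) /
           Math.Sqrt((array1.Length * sum_xpow2 - Ex2) * (array1.Length * sum_ypow2 - Ey2));
}
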
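You can keep the values in separate lists, paired by index, and compute the coefficient with a simple helper (called ComputeCoeff in the usage below).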
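A helper like `ComputeCoeff` can be sketched as follows. This is an assumption about its shape, not the answer's exact code: it pairs the two arrays by index with LINQ's `Zip` and applies the textbook mean-centred Pearson formula. The class name `CorrelationSketch` and the argument checks are illustrative.

```csharp
using System;
using System.Linq;

public static class CorrelationSketch
{
    // Hypothetical helper: Pearson correlation of two equally long arrays.
    public static double ComputeCoeff(double[] values1, double[] values2)
    {
        if (values1 == null) throw new ArgumentNullException(nameof(values1));
        if (values2 == null) throw new ArgumentNullException(nameof(values2));
        if (values1.Length != values2.Length)
            throw new ArgumentException("values must be the same length");

        double avg1 = values1.Average();
        double avg2 = values2.Average();

        // Sum of products of deviations from the means, pairing elements by index.
        double sumProducts = values1.Zip(values2, (a, b) => (a - avg1) * (b - avg2)).Sum();

        // Sums of squared deviations for each array.
        double sumSquares1 = values1.Sum(a => (a - avg1) * (a - avg1));
        double sumSquares2 = values2.Sum(b => (b - avg2) * (b - avg2));

        return sumProducts / Math.Sqrt(sumSquares1 * sumSquares2);
    }
}
```

Centring on the means before multiplying keeps the intermediate sums small, which tends to behave better numerically than accumulating raw squares when the values are large relative to their spread.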

Usage:

var values1 = new List<double> { 3, 2, 4, 5 ,6 };
var values2 = new List<double> { 9, 7, 12 ,15, 17 };

var result = ComputeCoeff(values1.ToArray(), values2.ToArray());
// 0.997054485501581

Debug.Assert(result.ToString("F6") == "0.997054");

Another approach is to use the Excel function directly:

var values1 = new List<double> { 3, 2, 4, 5 ,6 };
var values2 = new List<double> { 9, 7, 12 ,15, 17 };

// Make sure to add a reference to Microsoft.Office.Interop.Excel.dll
// and use the namespace

var application = new Application();

var worksheetFunction = application.WorksheetFunction;

var result = worksheetFunction.Correl(values1.ToArray(), values2.ToArray());

Console.Write(result); // 0.997054485501581

To calculate the Pearson product-moment correlation coefficient, you can use this simple code:

  public static Double Correlation(Double[] Xs, Double[] Ys) {
    Double sumX = 0;
    Double sumX2 = 0;
    Double sumY = 0;
    Double sumY2 = 0;
    Double sumXY = 0;

    int n = Xs.Length < Ys.Length ? Xs.Length : Ys.Length;

    for (int i = 0; i < n; ++i) {
      Double x = Xs[i];
      Double y = Ys[i];

      sumX += x;
      sumX2 += x * x;
      sumY += y;
      sumY2 += y * y;
      sumXY += x * y;
    }

    Double stdX = Math.Sqrt(sumX2 / n - sumX * sumX / n / n);
    Double stdY = Math.Sqrt(sumY2 / n - sumY * sumY / n / n);
    Double covariance = (sumXY / n - sumX * sumY / n / n);

    return covariance / stdX / stdY; 
  }
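
Written out, the method above evaluates the population form of Pearson's coefficient from five running sums. The symbols below are shorthand introduced here for the variables in the code (S_x for sumX, S_xx for sumX2, S_y for sumY, S_yy for sumY2, S_xy for sumXY), and n is the number of pairs:

```latex
r \;=\; \frac{\dfrac{S_{xy}}{n} - \dfrac{S_x\,S_y}{n^2}}
             {\sqrt{\dfrac{S_{xx}}{n} - \dfrac{S_x^{2}}{n^2}}\;
              \sqrt{\dfrac{S_{yy}}{n} - \dfrac{S_y^{2}}{n^2}}},
\qquad
S_x = \sum_{i=1}^{n} x_i,\quad
S_y = \sum_{i=1}^{n} y_i,\quad
S_{xx} = \sum_{i=1}^{n} x_i^{2},\quad
S_{yy} = \sum_{i=1}^{n} y_i^{2},\quad
S_{xy} = \sum_{i=1}^{n} x_i y_i
```

Multiplying numerator and denominator by n² gives the equivalent form (n·S_xy − S_x·S_y) / sqrt((n·S_xx − S_x²)(n·S_yy − S_y²)). Both forms subtract two potentially large, nearly equal quantities, so some precision can be lost when the mean is large compared with the spread of the data.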
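
Math.NET Numerics is a well-documented math library that contains a Correlation class. It calculates Pearson and Spearman ranked correlations.

The library is available under the very liberal MIT/X11 license. Using it to calculate a correlation coefficient looks like this: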

using MathNet.Numerics.Statistics;

...

double correlation = Correlation.Pearson(arrayOfValues1, arrayOfValues2);

Good luck!

In my tests, the code posted above by both @Dmitry Bychenko and @keyboardP produced essentially the same correlations as Microsoft Excel across a handful of manual tests, and neither needs any external library.

For example, one run (the data for this run is listed at the bottom):

@Dmitry Bychenko: -0.00418479432051211

@keyboardP: …

MS Excel: -0.004184794

Here is a test harness:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace TestCorrel {
    class Program {

        static void Main(string[] args) {

            Random rand = new Random(DateTime.Now.Millisecond);

            List<double> x = new List<double>();
            List<double> y = new List<double>();

            for (int i = 0; i < 100; i++) {

                x.Add(rand.Next(1000) * rand.NextDouble());
                y.Add(rand.Next(1000) * rand.NextDouble());

                Console.WriteLine(x[i] + "," + y[i]);
            }

            Console.WriteLine("Correl1: " + Correl1(x, y));
            Console.WriteLine("Correl2: " + Correl2(x, y));
        }

        public static double Correl1(List<double> x, List<double> y) {

            //https://stackoverflow.com/questions/17447817/correlation-of-two-arrays-in-c-sharp
            if (x.Count != y.Count)
                return (double.NaN); //throw new ArgumentException("values must be the same length");

            double sumX = 0;
            double sumX2 = 0;
            double sumY = 0;
            double sumY2 = 0;
            double sumXY = 0;

            int n = x.Count < y.Count ? x.Count : y.Count;

            for (int i = 0; i < n; ++i) {

                Double xval = x[i];
                Double yval = y[i];

                sumX += xval;
                sumX2 += xval * xval;
                sumY += yval;
                sumY2 += yval * yval;
                sumXY += xval * yval;
            }

            Double stdX = Math.Sqrt(sumX2 / n - sumX * sumX / n / n);
            Double stdY = Math.Sqrt(sumY2 / n - sumY * sumY / n / n);
            Double covariance = (sumXY / n - sumX * sumY / n / n);

            return covariance / stdX / stdY;
        }

        public static double Correl2(List<double> x, List<double> y) {

            double[] array_xy = new double[x.Count];
            double[] array_xp2 = new double[x.Count];
            double[] array_yp2 = new double[x.Count];

            for (int i = 0; i < x.Count; i++)
                array_xy[i] = x[i] * y[i];
            for (int i = 0; i < x.Count; i++)
                array_xp2[i] = Math.Pow(x[i], 2.0);
            for (int i = 0; i < x.Count; i++)
                array_yp2[i] = Math.Pow(y[i], 2.0);
            double sum_x = 0;
            double sum_y = 0;
            foreach (double n in x)
                sum_x += n;
            foreach (double n in y)
                sum_y += n;
            double sum_xy = 0;
            foreach (double n in array_xy)
                sum_xy += n;
            double sum_xpow2 = 0;
            foreach (double n in array_xp2)
                sum_xpow2 += n;
            double sum_ypow2 = 0;
            foreach (double n in array_yp2)
                sum_ypow2 += n;
            double Ex2 = Math.Pow(sum_x, 2.00);
            double Ey2 = Math.Pow(sum_y, 2.00);

            double Correl = 
            (x.Count * sum_xy - sum_x * sum_y) /
            Math.Sqrt((x.Count * sum_xpow2 - Ex2) * (x.Count * sum_ypow2 - Ey2));

            return (Correl);
        }
    }
}

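This may also help you; it is code for the correlation coefficient.

There is a library that has a "CORREL" function. It is very easy to use and gives the same results as Excel.

It returns an array of results rather than a single value as Excel does.

+1 Thanks for providing a code example and clarifying how the library works! The problem is that it only works with int arrays, not double arrays. Not your fault, of course, but I can't mark this as answered.

Yes, I didn't notice that the parameters are of type int. If you need to use double, you will probably have to write your own extension method for it. If you look at that class, you'll see that it uses a matrix to compute the correlation coefficient, so you could probably emulate it.

Thanks for your effort, I really appreciate it! I was also considering custom code and the Excel API, but that seems like too much work for such a common task :)

I'm glad you found my example helpful! The Excel API is a bit crude, but it works.

Thanks for the link! This may actually be the best library so far; using the method really couldn't be simpler :-)

As an update, version 3.5 of Math.NET Numerics added a method to its Correlation class for computing weighted Pearson correlation.

This is the code from Contango, translated to VB.NET. It gives the same result as Excel's Correl function.

Wrong language.

Hi Dmitry, please tell me: if all values in the arrays are the same, the function returns NaN. Do I have to check whether they are all equal and return 1, or does NaN always mean 1? Tks!

@Tico Fortes: if all values in the array are the same, then you effectively have just one point and no variation; if there is …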
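
On the NaN question: when one array is constant its variance is zero, so the quotient becomes 0/0 and the coefficient is mathematically undefined; it does not mean 1. A minimal sketch of making that case explicit, assuming a hypothetical wrapper `TryCorrelation` (not part of any answer above) that checks for constant input before applying the mean-centred formula:

```csharp
using System;

public static class SafeCorrelation
{
    // Hypothetical wrapper: returns false (and r = NaN) when the coefficient is undefined.
    public static bool TryCorrelation(double[] xs, double[] ys, out double r)
    {
        r = double.NaN;
        if (xs == null || ys == null || xs.Length != ys.Length || xs.Length < 2)
            return false;

        int n = xs.Length;

        // A constant array has no variation, so no correlation can be computed.
        bool xVaries = false, yVaries = false;
        for (int i = 1; i < n; i++)
        {
            if (xs[i] != xs[0]) xVaries = true;
            if (ys[i] != ys[0]) yVaries = true;
        }
        if (!xVaries || !yVaries)
            return false;

        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += xs[i]; meanY += ys[i]; }
        meanX /= n;
        meanY /= n;

        // Sums of squared deviations and of cross products.
        double sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < n; i++)
        {
            double dx = xs[i] - meanX;
            double dy = ys[i] - meanY;
            sxx += dx * dx;
            syy += dy * dy;
            sxy += dx * dy;
        }

        double candidate = sxy / Math.Sqrt(sxx * syy);
        if (double.IsNaN(candidate) || double.IsInfinity(candidate))
            return false;

        r = candidate;
        return true;
    }
}
```

Whether an undefined coefficient should surface as NaN, a false return, or an exception is a design choice for the caller; the sketch only shows that it should not be silently mapped to 1.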

Data for the test run above (x,y pairs):

287.688269702572,225.610842817282
618.9313498167,177.955550192835
25.7778882802361,27.6549569366756
140.847984766051,714.618547504125
438.618761728806,533.48764902702
481.347431274758,214.381256273194
21.6406916848573,393.559209519792
135.30397563209,158.419851317732
334.314685154853,814.275162949821
764.614904770914,50.1435267264692
42.8179292282173,47.8631582287434
237.216836650491,370.488416981179
388.849658539449,134.961087643151
305.903013161804,441.926902444068
10.6625048679591,369.567569480076
36.9316453891488,24.8947204607049
2.10067253471383,491.941975629861
7.94887068492774,573.037801189831
341.738006353722,653.497146697015
98.8424873439793,475.215988045193
272.248712629196,36.1088809138671
122.336823399801,169.158256422336
9.32281673202422,631.076001565473
201.118425176068,803.724831627554
415.514343714115,64.248651454341
227.791637123,230.512133914284
25.3438658925443,396.854282886188
596.238994411304,72.543763144195
230.239735877253,933.983901697669
796.060099040186,689.952468971234
9.30882684202344,269.22063744125
16.5005430148451,8.96549091859045
536.324005148524,358.829873788557
519.694526420764,17.3212184707267
552.628357889423,12.5541588051962
210.516099897454,388.57537739937
141.341571405689,268.082028986924
503.880356335491,753.447006912645
515.494990213539,444.451280259737
973.8670776076,168.922799013985
85.7111146094795,36.3784999169309
37.2147129193017,108.040356312432
504.590177939548,50.3934166889607
482.821039277511,888.984586256083
5.52549206350255,156.717087003271
405.833169031345,394.099059180868
459.249365587835,11.68776424494
429.421127440604,314.216759666901
126.908422469584,331.907062556551
62.1416232716952,3.19765723645578
4.16058817699579,604.04046284223
484.262182311277,220.177370167886
58.6774453314382,339.09660232677
463.482149892246,199.181594849183
344.128297473829,268.531428258182
0.883430369609702,209.346384477963
77.9462970131758,255.221325168955
583.629439312792,235.557751925922
358.409186083083,376.046612200349
81.2148325150902,10.7696774717279
53.7315618049966,274.171515094196
111.284646992239,130.174321939319
317.280491961763,338.077288461885
177.454564264722,7.53587801919127
69.2239431670047,233.693477620228
823.419546454875,0.111916855029723
23.7174749401014,200.989081544331
44.9598299125022,102.633862571155
74.1602278468945,292.485449988155
130.11182449251,23.4682153367755
243.088760058903,335.807090202722
13.3974915991526,436.983231269281
73.3900805168739,252.352352472186
592.144630201228,92.3395205570103
57.7306153447044,47.1416798900541
522.649018382024,584.427794722108
15.3662010204821,60.1693953262499
16.8335716728277,851.401980430541
33.9869734449251,0.930781653584345
116.66608504982,146.126050951949
92.8896130355492,711.765618208687
317.91980889529,322.186540377413
44.8574470732629,209.275617858058
751.201537871362,37.935519233316
161.817758424588,2.83156183493862
531.64078452142,79.1750782491523
114.803219681048,283.106988439852
123.472725123853,154.125248027558
89.9276725453919,63.4626924192825
105.623296753328,111.234188702067
435.72981759707,23.7058234576629
259.324810619152,69.3535200857341
719.885234421531,381.086239833891
24.2674900099018,198.408173349876
57.7761600361095,146.52277489124
77.4594609157459,710.746080866431
636.671781979814,538.894185951396
56.6035279932448,58.2563265684323
485.16099039333,427.849954283261
91.9552873247095,576.92944263617
Public Function Correlation(ByRef array1() As Double, ByRef array2() As Double) As Double
    'see https://stackoverflow.com/questions/17447817/correlation-of-two-arrays-in-c-sharp

    'the "Pearson correlation coefficient" computed here still has to be squared to obtain R-squared, see
    'https://en.wikipedia.org/wiki/Coefficient_of_determination


    Dim array_xy(array1.Length - 1) As Double
    Dim array_xp2(array1.Length - 1) As Double
    Dim array_yp2(array1.Length - 1) As Double

    Dim i As Integer
    For i = 0 To array1.Length - 1
        array_xy(i) = array1(i) * array2(i)
    Next i
    For i = 0 To array1.Length - 1
        array_xp2(i) = Math.Pow(array1(i), 2.0)
    Next i
    For i = 0 To array1.Length - 1
        array_yp2(i) = Math.Pow(array2(i), 2.0)
    Next i


    Dim sum_x As Double = 0
    Dim sum_y As Double = 0
    Dim EinDouble As Double

    For Each EinDouble In array1
        sum_x += EinDouble
    Next
    For Each EinDouble In array2
        sum_y += EinDouble
    Next

    Dim sum_xy As Double = 0
    For Each EinDouble In array_xy
        sum_xy += EinDouble
    Next

    Dim sum_xpow2 As Double = 0
    For Each EinDouble In array_xp2
        sum_xpow2 += EinDouble
    Next

    Dim sum_ypow2 As Double = 0
    For Each EinDouble In array_yp2
        sum_ypow2 += EinDouble
    Next

    Dim Ex2 As Double = Math.Pow(sum_x, 2.0)
    Dim Ey2 As Double = Math.Pow(sum_y, 2.0)

    Dim ReturnWert As Double
    ReturnWert = (array1.Length * sum_xy - sum_x * sum_y) / Math.Sqrt((array1.Length * sum_xpow2 - Ex2) * (array1.Length * sum_ypow2 - Ey2))
    Correlation = ReturnWert
End Function