C#: Counting duplicates in a ListBox


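I'm trying to develop a simple application in C# that counts duplicates in a ListBox. I need to count every duplicate and show the count as a suffix on the three most-duplicated elements. For example, suppose a list has 7 elements called "apple", 6 called "pear", 4 called "peach", and 3 called "orange". After processing, it should display the list as: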

apple (7)
pear (6)
peach (4)
orange

Since we don't know what data source you are using, here is a generic LINQ example to help you get started.

string[] items = { "apple", "pear", "peach", "apple", "orange", "peach", "apple" };

var ranking = (from item in items
               group item by item into r
               orderby r.Count() descending
               select new { name = r.Key, rank = r.Count() }).Take(3);

This returns a collection of objects holding the name and rank (the occurrence count) of the top 3 items.

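Of course, you can replace the items array here with whatever data source you use to populate the ListBox, and if the items are more complex than simple strings, you can adjust the LINQ query accordingly.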
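
As a rough illustration of adjusting the query for more complex items, the sketch below groups objects by one of their properties instead of grouping plain strings. The Fruit class, its Name and Color properties, and the TopThree helper are hypothetical names invented for this example, and the case-insensitive comparison is an assumption rather than something the question asked for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; stands in for whatever objects populate the ListBox.
class Fruit
{
    public string Name { get; set; }
    public string Color { get; set; }
}

static class DuplicateRanking
{
    // Groups the items by Name (ignoring case, which is an assumption),
    // orders the groups by size, and returns the three largest as
    // (name, count) pairs.
    public static List<KeyValuePair<string, int>> TopThree(IEnumerable<Fruit> fruits)
    {
        return fruits
            .GroupBy(f => f.Name, StringComparer.OrdinalIgnoreCase)
            .OrderByDescending(g => g.Count())
            .Take(3)
            .Select(g => new KeyValuePair<string, int>(g.Key, g.Count()))
            .ToList();
    }
}
```

Each returned pair carries the group key as it first appeared in the input, so "apple" followed by "Apple" would be reported under "apple".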

Here is a version of the above that populates a ListBox with the data in the form you showed.

  string[] items = { "apple", "pear", "peach", "apple", "orange", "peach", "apple" };

  var ranking = (from item in items
                 group item by item into r
                 orderby r.Count() descending
                 select new { name = r.Key, rank = r.Count() }).ToArray();

  for (int i = 0; i < ranking.Length; ++i)
  {
    var item = ranking[i];
    if (i < 3)
    {
      listBox1.Items.Add(string.Format("{0} ({1})", item.name, item.rank));
    }
    else
    {
      listBox1.Items.Add(item.name);
    }
  }

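This is the same as the first example, but it converts the result to an array and fills the ListBox with every item, showing the count only for the top 3.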

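Here is an alternative to using Linq, along with a timing test to see which runs faster. These are the results I got over 1000 iterations (values are Stopwatch ticks):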

Total words: 1324
Min        Max        Mean       Method
5305       22889      5739.182   LinkMethodToArray
5053       11973      5418.355   LinkMethod
3112       6726       3252.457   HashMethod

In this case the Linq method is only about 1.6 times slower. That is not nearly as bad as a lot of the Linq code I have tested, but this is only 1324 words.
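
The figures in the table are raw Stopwatch.ElapsedTicks values, whose duration depends on the machine's timer. If wall-clock units are easier to compare, a conversion along the following lines could be used; the TickConversion class and TicksToMicroseconds helper are hypothetical names, not part of the benchmark.

```csharp
using System;
using System.Diagnostics;

static class TickConversion
{
    // Stopwatch.Frequency is the number of ticks per second on this
    // machine, so ticks / Frequency is seconds; scale to microseconds.
    public static double TicksToMicroseconds(long ticks)
    {
        return ticks * 1000000.0 / Stopwatch.Frequency;
    }
}
```

Because the frequency differs between machines, tick counts from different systems are only comparable after a conversion like this.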

Edit #1

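That was before adding the sort. With sorting included, you can see that it is comparable to the Linq method. Of course, copying the hash into a list and then sorting the list is not the most efficient approach, and it could be improved. A few ways come to mind, but none of them is simple, and all would require writing a lot of custom code.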

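Since we want to use what is already available, and we want the code to be clear, I have to say that Linq is actually a very good choice. This has changed my opinion of Linq... a little. I have seen too many other comparisons where Linq was far slower (on the order of 1000 times slower) to give it a green light for use everywhere, but in this particular case it performs very well.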

I guess the moral is, as always: test, test, test.

Here are the values with sorting added to HashMethod.

Total words: 1324
Min        Max        Mean       Method
5284       21030      5667.808   LinkMethodToArray
5081       36339      5425.626   LinkMethod
5017       27583      5288.602   HashMethod

Edit #2

A few simple optimizations (pre-sizing the dictionary and the list) make HashMethod noticeably faster.

Total words: 1324
Min        Max        Mean       Method
5287       16299      5686.429   LinkMethodToArray
5081       21813      5440.758   LinkMethod
4588       8420       4710.659   HashMethod
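
For illustration only, here is one way a pre-sized counting routine might look. It is a variation, not the code that produced the numbers above: it swaps the ContainsKey check for TryGetValue and gives the list an explicit capacity before copying. The HashCounting class and CountAndSort method are hypothetical names, and no timing claim is made for this version.

```csharp
using System;
using System.Collections.Generic;

static class HashCounting
{
    // Counts occurrences of each string, then returns the (word, count)
    // pairs sorted by count in descending order.
    public static List<KeyValuePair<string, int>> CountAndSort(string[] items)
    {
        // The initial capacity is a guess (half the input length).
        var counts = new Dictionary<string, int>(items.Length / 2);
        foreach (string item in items)
        {
            int current;
            // TryGetValue leaves current at 0 when the key is absent.
            counts.TryGetValue(item, out current);
            counts[item] = current + 1;
        }

        // Allocate the list at its final size, then copy and sort.
        var list = new List<KeyValuePair<string, int>>(counts.Count);
        list.AddRange(counts);
        list.Sort((a, b) => b.Value.CompareTo(a.Value));
        return list;
    }
}
```

List.Sort is not a stable sort, so words with equal counts may come back in any order; the Linq orderby, by contrast, keeps tied groups in their original order.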

Edit #3

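With a larger set of words, the methods become more evenly matched. In fact, the Linq method seems to edge out the hash method every time. This was with the United States Constitution (all seven articles and the signatures). The difference may be because the Declaration contains a lot of repeated words ("He has ...").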

Code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Threading;

class Program
{
    static void Main()
    {
        Thread.CurrentThread.Priority = ThreadPriority.Highest;

        // Declaration.txt is a copy of the Declaration of Independence
        // which can be found here: http://en.wikisource.org/wiki/United_States_Declaration_of_Independence
        string declaration = File.ReadAllText("Declaration.txt");
        string[] items = declaration.ToLower().Split(new char[] { ',', '.', ':', ';', '-', '\r', '\n', '\t', ' ' }, StringSplitOptions.RemoveEmptyEntries);

        // pre-execute outside timing loop
        LinqMethodToArray(items);
        LinqMethod(items);
        HashMethod(items);

        int iterations = 1000;
        long min1 = long.MaxValue, max1 = long.MinValue, sum1 = 0;
        long min2 = long.MaxValue, max2 = long.MinValue, sum2 = 0;
        long min3 = long.MaxValue, max3 = long.MinValue, sum3 = 0;

        Console.WriteLine("Iterations: {0}", iterations);
        Console.WriteLine("Total words: {0}", items.Length);

        Stopwatch sw = new Stopwatch();

        for (int n = 0; n < iterations; n++)
        {
            sw.Reset();
            sw.Start();
            LinqMethodToArray(items);
            sw.Stop();
            sum1 += sw.ElapsedTicks;
            if (sw.ElapsedTicks < min1)
                min1 = sw.ElapsedTicks;
            if (sw.ElapsedTicks > max1)
                max1 = sw.ElapsedTicks;

            sw.Reset();
            sw.Start();
            LinqMethod(items);
            sw.Stop();
            sum2 += sw.ElapsedTicks;
            if (sw.ElapsedTicks < min2)
                min2 = sw.ElapsedTicks;
            if (sw.ElapsedTicks > max2)
                max2 = sw.ElapsedTicks;

            sw.Reset();
            sw.Start();
            HashMethod(items);
            sw.Stop();
            sum3 += sw.ElapsedTicks;
            if (sw.ElapsedTicks < min3)
                min3 = sw.ElapsedTicks;
            if (sw.ElapsedTicks > max3)
                max3 = sw.ElapsedTicks;
        }

        Console.WriteLine("{0,-10} {1,-10} {2,-10} Method", "Min", "Max", "Mean");
        Console.WriteLine("{0,-10} {1,-10} {2,-10} LinkMethodToArray", min1, max1, (double)sum1 / iterations);
        Console.WriteLine("{0,-10} {1,-10} {2,-10} LinkMethod", min2, max2, (double)sum2 / iterations);
        Console.WriteLine("{0,-10} {1,-10} {2,-10} HashMethod", min3, max3, (double)sum3 / iterations);
    }

    static void LinqMethodToArray(string[] items)
    {
        var ranking = (from item in items
                       group item by item into r
                       orderby r.Count() descending
                       select new { Name = r.Key, Rank = r.Count() }).ToArray();
        for (int n = 0; n < ranking.Length; n++)
        {
            var item = ranking[n];
            DoSomethingWithItem(item);
        }
    }

    static void LinqMethod(string[] items)
    {
        var ranking = (from item in items
                       group item by item into r
                       orderby r.Count() descending
                       select new { Name = r.Key, Rank = r.Count() });
        foreach (var item in ranking)
            DoSomethingWithItem(item);
    }

    static void HashMethod(string[] items)
    {
        var ranking = new Dictionary<string, int>(items.Length / 2);
        foreach (string item in items)
        {
            if (!ranking.ContainsKey(item))
                ranking[item] = 1;
            else
                ranking[item]++;
        }
        var list = new List<KeyValuePair<string, int>>(ranking);
        list.Sort((a, b) => b.Value.CompareTo(a.Value));
        foreach (KeyValuePair<string, int> pair in list)
            DoSomethingWithItem(pair);

    }

    static volatile object hold;
    static void DoSomethingWithItem(object item)
    {
        // This method exists solely to prevent the compiler from
        // optimizing use of the item away so that this program
        // can be executed in Release build, outside the debugger.
        hold = item;
    }
}