C#: System.Linq performance problem when subdividing a list into multiple lists


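I wrote a method that uses System.Linq to subdivide a list of items into multiple lists. When I run this method on 50,000 simple integers, it takes about 59.862 seconds: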

    Stopwatch watchresult0 = new Stopwatch();
    watchresult0.Start();
    var result0 = SubDivideListLinq(Enumerable.Range(0, 50000), 100).ToList();
    watchresult0.Stop();
    long elapsedresult0 = watchresult0.ElapsedMilliseconds;

So I tried to improve it and used a simple loop that iterates over each item in the list. That takes only 4 milliseconds:

    Stopwatch watchresult1 = new Stopwatch();
    watchresult1.Start();
    var result1 = SubDivideList(Enumerable.Range(0, 50000), 100).ToList();
    watchresult1.Stop();
    long elapsedresult1 = watchresult1.ElapsedMilliseconds;

This is my subdivide method that uses Linq:

    private static IEnumerable<List<T>> SubDivideListLinq<T>(IEnumerable<T> enumerable, int count)
    {
        while (enumerable.Any())
        {
            yield return enumerable.Take(count).ToList();
            enumerable = enumerable.Skip(count);
        }
    }

Do you know why my own implementation is so much faster than dividing the list with Linq? Or am I doing something wrong?

Also: as you can see, I know how to split a list, so this is not a duplicate of the related question. I wanted to know about the performance difference between linq and my implementation. If anyone arrives here with the same problem and does not know how to split a list, the related question covers that.

So in the end I did some more research and found that the multiple enumeration in my System.Linq version is the cause of the poor performance. When I enumerate the input into an array first, to avoid enumerating the source multiple times, the performance is much better (14 ms for 50k items):

    T[] allItems = enumerable as T[] ?? enumerable.ToArray();
    while (allItems.Any())
    {
        yield return allItems.Take(count);
        allItems = allItems.Skip(count).ToArray();
    }

However, I will not use the linq approach, because it is still slower. Instead, I wrote an extension method to subdivide my list, which takes 3 ms for 50k items:

    public static class EnumerableExtensions
    {
        public static IEnumerable<List<T>> Subdivide<T>(this IEnumerable<T> enumerable, int count)
        {
            List<T> items = new List<T>(count);
            int index = 0;
            foreach (T item in enumerable)
            {
                items.Add(item);
                index++;
                if (index != count) continue;

                // The current list is full: hand it out and start a new one.
                yield return items;
                items = new List<T>(count);
                index = 0;
            }

            // Return the last, partially filled list (if any).
            if (index != 0 && items.Any())
                yield return items;
        }
    }
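The multiple enumeration mentioned above can be made visible by counting how often the source hands out an item. The following is only a sketch: `EnumerationCountDemo`, `CountingRange` and `Pulled` are hypothetical names introduced for illustration, and the Skip/Take method is copied from the question. Every `Any()` and every `Take(count)` starts at the beginning of the source again and has to walk past all items that earlier chunks already skipped, so the number of pulls grows much faster than the number of items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCountDemo
{
    // Number of items the source has handed out so far (hypothetical helper).
    public static long Pulled;

    // Yields 0..n-1 and counts every item that a consumer pulls.
    public static IEnumerable<int> CountingRange(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Same shape as the Skip/Take method from the question.
    public static IEnumerable<List<T>> SubDivideListLinq<T>(IEnumerable<T> enumerable, int count)
    {
        while (enumerable.Any())
        {
            yield return enumerable.Take(count).ToList();
            enumerable = enumerable.Skip(count);
        }
    }

    public static void Main()
    {
        Pulled = 0;
        int chunks = SubDivideListLinq(CountingRange(1000), 100).Count();
        // Pulled ends up many times larger than the 1000 items in the source.
        Console.WriteLine($"chunks: {chunks}, items pulled from source: {Pulled}");
    }
}
```

A single `foreach` over the same source would pull each item exactly once, which is what the loop-based `Subdivide` method does.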

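As @Andreasniedermir already wrote, this kind of subdivision is also included in the MoreLinq library, where it is called Batch. (But I will not add a library just for this one method right now.)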
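For readers who also want to avoid an extra library: assuming the project targets .NET 6 or later (an assumption, the thread does not say which framework is in use), the base class library has a built-in `Enumerable.Chunk` method that returns arrays instead of lists. A minimal sketch, with `ChunkDemo` and `Split` as illustrative names:

```csharp
using System;
using System.Linq;

public static class ChunkDemo
{
    // Splits 0..total-1 into arrays of at most `size` items (requires .NET 6+).
    public static int[][] Split(int total, int size) =>
        Enumerable.Range(0, total).Chunk(size).ToArray();

    public static void Main()
    {
        foreach (int[] chunk in Split(250, 100))
        {
            // Prints "100 items, first = 0", "100 items, first = 100", "50 items, first = 200".
            Console.WriteLine($"{chunk.Length} items, first = {chunk[0]}");
        }
    }
}
```

On older frameworks this method is not available, so the hand-written extension method or MoreLinq's Batch remain the options discussed here.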
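If you want both better readability and good performance, you may want to use this algorithm instead. In terms of speed, it comes very close to your non-linq version. At the same time, it is more readable: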

    private static IEnumerable<List<T>> SubDivideListLinq<T>(IEnumerable<T> enumerable, int count)
    {
        int index = 0;
        return enumerable.GroupBy(l => index++ / count).Select(l => l.ToList());
    }
And its alternative, using query syntax:

    private static IEnumerable<List<T>> SubDivideListLinq<T>(IEnumerable<T> enumerable, int count)
    {
        int index = 0;
        return from l in enumerable
               group l by index++ / count into g
               select g.ToList();
    }

Another alternative, using the GroupBy overload with a result selector:

    private static IEnumerable<List<T>> SubDivideListLinq<T>(IEnumerable<T> enumerable, int count)
    {
        int index = 0;
        return enumerable.GroupBy(l => index++ / count, item => item, (key, result) => result.ToList());
    }
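One caveat with the GroupBy variants above: the counter `index` is captured by the lambda and lives as long as the returned query. If the result is enumerated only once (as in the question, where `.ToList()` is called right away) this does not matter, but a second enumeration continues counting where the first one stopped and produces different groups. The sketch below shows the effect and one possible way around it, using the `Select` overload that supplies the element position. `GroupByChunking`, `WithCapturedCounter` and `WithIndexedSelect` are hypothetical names, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByChunking
{
    // Variant from the answer: the captured counter is shared by all enumerations.
    public static IEnumerable<List<T>> WithCapturedCounter<T>(IEnumerable<T> source, int count)
    {
        int index = 0;
        return source.GroupBy(item => index++ / count).Select(g => g.ToList());
    }

    // Alternative sketch: Select provides a fresh position on every enumeration.
    public static IEnumerable<List<T>> WithIndexedSelect<T>(IEnumerable<T> source, int count)
    {
        return source
            .Select((item, position) => new { item, position })
            .GroupBy(x => x.position / count, x => x.item)
            .Select(g => g.ToList());
    }

    public static void Main()
    {
        var captured = WithCapturedCounter(Enumerable.Range(0, 7), 3);
        Console.WriteLine(string.Join(",", captured.Select(c => c.Count))); // 3,3,1
        Console.WriteLine(string.Join(",", captured.Select(c => c.Count))); // 2,3,2

        var indexed = WithIndexedSelect(Enumerable.Range(0, 7), 3);
        Console.WriteLine(string.Join(",", indexed.Select(c => c.Count)));  // 3,3,1
        Console.WriteLine(string.Join(",", indexed.Select(c => c.Count)));  // 3,3,1
    }
}
```

Materializing the result once with `.ToList()` avoids the problem for the captured-counter variant as well.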
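On my machine I get 0.006 seconds for the linq version versus 0.002 seconds for the non-linq version, which is completely fair and makes using linq acceptable.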
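Numbers like these depend on the machine and on how they are measured. A rough way to repeat the comparison is sketched below. It assumes a loop-based method equivalent to the question's `Subdivide` and the GroupBy variant from this answer; `TimingSketch`, `SubdivideLoop`, `SubdivideGroupBy` and `Measure` are illustrative names. The helper runs the code once untimed, because the first call also pays for JIT compilation, and then reports the best of several runs.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Loop-based version, equivalent in behavior to the question's Subdivide method.
    public static IEnumerable<List<T>> SubdivideLoop<T>(IEnumerable<T> source, int count)
    {
        var items = new List<T>(count);
        foreach (T item in source)
        {
            items.Add(item);
            if (items.Count != count) continue;
            yield return items;
            items = new List<T>(count);
        }
        if (items.Count > 0) yield return items;
    }

    // GroupBy-based version from the answer.
    public static IEnumerable<List<T>> SubdivideGroupBy<T>(IEnumerable<T> source, int count)
    {
        int index = 0;
        return source.GroupBy(item => index++ / count).Select(g => g.ToList());
    }

    // Runs the action once as a warm-up, then returns the fastest of `runs` timed runs.
    public static TimeSpan Measure(Action action, int runs = 5)
    {
        action();
        TimeSpan best = TimeSpan.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            if (watch.Elapsed < best) best = watch.Elapsed;
        }
        return best;
    }

    public static void Main()
    {
        int[] data = Enumerable.Range(0, 50000).ToArray();
        TimeSpan loop = Measure(() => SubdivideLoop(data, 100).ToList());
        TimeSpan groupBy = Measure(() => SubdivideGroupBy(data, 100).ToList());
        Console.WriteLine($"loop: {loop.TotalMilliseconds} ms, GroupBy: {groupBy.TotalMilliseconds} ms");
    }
}
```

For anything beyond a quick check, a dedicated benchmarking tool gives more trustworthy results than a hand-rolled Stopwatch loop.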

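As a suggestion, do not torture yourself with micro-optimized code. Nobody will notice a difference of a few milliseconds, so write code that you and others can easily understand later.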
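The LINQ version executes the query twice in every iteration. It also has empty loops, because it always has to find its last position again, whereas your optimized method can simply continue where it left off. Your method also initializes each list with the correct size, whereas the LINQ version has to resize the internal array every time. Also, do not use while (enumerable.Any()). With some iterators you may lose values. You have to consume all values safely, for example with foreach or with MoveNext and Current.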
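The MoveNext/Current approach from the comment above could look like the following sketch. It enumerates the source exactly once, so it also works for sources that cannot be restarted. `EnumeratorChunking` and `SubdivideWithEnumerator` are hypothetical names, and the argument checks are an addition that the original methods do not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumeratorChunking
{
    public static IEnumerable<List<T>> SubdivideWithEnumerator<T>(IEnumerable<T> source, int count)
    {
        // Validate eagerly; the iterator below only runs when it is enumerated.
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (count <= 0) throw new ArgumentOutOfRangeException(nameof(count));
        return Iterate(source, count);
    }

    private static IEnumerable<List<T>> Iterate<T>(IEnumerable<T> source, int count)
    {
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            // Each successful outer MoveNext provides the first item of a new chunk.
            while (enumerator.MoveNext())
            {
                var chunk = new List<T>(count) { enumerator.Current };

                // Fill the chunk; MoveNext is not called once the chunk is full.
                while (chunk.Count < count && enumerator.MoveNext())
                {
                    chunk.Add(enumerator.Current);
                }

                yield return chunk;
            }
        }
    }

    public static void Main()
    {
        foreach (List<int> chunk in SubdivideWithEnumerator(Enumerable.Range(0, 7), 3))
        {
            Console.WriteLine(string.Join(",", chunk)); // 0,1,2 then 3,4,5 then 6
        }
    }
}
```

In contrast, `Any()` starts a fresh enumeration just to test for one element, which is where a non-restartable source can lose a value.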
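Yes, I do not think this is a duplicate, because the "duplicate" post does not cover performance. In fact, the poorly performing LINQ code comes from a (for whatever reason) highly voted answer.

If you do not want to learn from the duplicate, that is your problem. I see no value in copying and pasting half of that thread into this one.

You worry about performance after you have got it right. As long as you are doing it wrong, performance has little value. I once wrote a perfect packing algorithm. It was super fast and compressed extremely well. But I could not get it to decompress. Oops. Get it right first, then make it fast, because as long as there are bugs, fast is worthless.

FYI, you are absolutely right, I like using linq. A difference of 0.004 seconds is fine. A difference of 1 minute is not, which is why I asked the question :) This looks more readable and is also nice. With your comment above and your answer, I will accept your answer and use it.

Now you have both ruined the original post. I took part in reopening it as a non-duplicate because it was about the performance of a specific implementation. This answer and the accept turn it back into a duplicate of a question that already has many answers on how to perform the split, including ones similar (or identical) to this one. If you think this answer adds something to the topic, post it under the question that was identified as the duplicate. It certainly does not answer the question asked in this post.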