C#: Preloading the next IEnumerable<T> value

c# .net performance parallel-processing

Given the following class structure:

public class Foo
{
    public IEnumerable<Bar> GetBars()
    {
        for(int i = 0; i < 1000; i++)
        {
            Thread.Sleep(1000);
            yield return new Bar() { Name = i.ToString() };
        }
    }
}

public class Bar
{
    public string Name { get; set; }
}

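However, because of the delay, I would like to keep preloading the next Bar value for each Foo, and then have each Foo's IEnumerable merged with the others in the order in which the values become available.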

I have been looking at the TPL Dataflow async NuGet library (specifically TransformBlock and, to a lesser extent, ActionBlock), but I could not find anything that helps me do what I am trying to do.

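I would suggest looking at Rx (Reactive Extensions). It basically lets you use LINQ over "push"-type collections (IObservable) instead of "pull"-type collections (IEnumerable). In other words, your code can react to new items as they become available in the collection.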

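The problem is that, parallel or not, you cannot even start fetching the second Bar object until you have obtained the first one. PLINQ only really helps when each object undergoes long-running processing inside the LINQ operators, not when the delay comes from the underlying IEnumerable itself.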

One option is to return a sequence of Task<Bar> objects, so that advancing the iterator takes very little time:
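// The index is passed in explicitly; the loop variable is not in scope here otherwise.
public async Task<Bar> GenerateFoo(int i)
{
    await Task.Delay(1000);
    return new Bar() { Name = i.ToString() };
}

public IEnumerable<Task<Bar>> GetBars()
{
    for(int i = 0; i < 1000; i++)
    {
        // Yields an already-started task instead of blocking for one second.
        yield return GenerateFoo(i);
    }
}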

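With that code, advancing the iterator merely starts the generation of a Bar instead of waiting for it to finish. Once you have the tasks, you can either add continuations to each task to process its Bar, or use methods such as Task.WaitAll or Task.WhenAll to wait for all of them to complete.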
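
The Task.WhenAll route described above could look roughly like the sketch below. TaskSequenceDemo, LoadAllNamesAsync, and the count and delayMs parameters are hypothetical names introduced for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Bar
{
    public string Name { get; set; }
}

// Hypothetical demo class, not from the original answer.
public static class TaskSequenceDemo
{
    // Stand-in for GetBars(): every yielded task is already running.
    public static IEnumerable<Task<Bar>> GetBars(int count, int delayMs)
    {
        for (int i = 0; i < count; i++)
        {
            yield return GenerateBar(i, delayMs);
        }
    }

    private static async Task<Bar> GenerateBar(int i, int delayMs)
    {
        await Task.Delay(delayMs);
        return new Bar { Name = i.ToString() };
    }

    public static async Task<string[]> LoadAllNamesAsync(int count, int delayMs)
    {
        // ToList() walks the whole iterator, which starts every task up front.
        List<Task<Bar>> tasks = GetBars(count, delayMs).ToList();

        // WhenAll returns the results in the same order as the input tasks.
        Bar[] bars = await Task.WhenAll(tasks);
        return bars.Select(b => b.Name).ToArray();
    }
}
```

Because all delays overlap, the total wait is close to a single delay rather than count times the delay. To handle each Bar as soon as it is ready, a continuation can be attached to each task instead of awaiting them all.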
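You can write an extension method like the one below, which yields the Bars (from any of the enumerables) as soon as they are available. Usage:
myFoo.Select(x=>x.GetBars()).Flatten().Select(bar => bar.Name)
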
public static class ParallelExtensions
{
    public static IEnumerable<T> Flatten<T>(this IEnumerable<IEnumerable<T>> enumOfEnums)
    {
        BlockingCollection<T> queue = new BlockingCollection<T>();

        Task.Factory.StartNew(() =>
        {
            Parallel.ForEach(enumOfEnums, e =>
            {
                foreach (var x in e)
                {
                    queue.Add(x);
                }
            });
            queue.CompleteAdding();
        });

        return queue.GetConsumingEnumerable();
    }
}
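
Parallel.ForEach partitions its input and limits how many items run concurrently, so not every inner sequence necessarily starts enumerating right away. A minimal sketch of a variant that starts one dedicated task per inner sequence is shown below. EagerFlattenExtensions and FlattenEagerly are hypothetical names, and the approach assumes the number of sequences is small enough that one thread each is acceptable.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical variant of the Flatten extension, for illustration only.
public static class EagerFlattenExtensions
{
    public static IEnumerable<T> FlattenEagerly<T>(this IEnumerable<IEnumerable<T>> sources)
    {
        // Unbounded queue: producers never block, the consumer reads at its own pace.
        var queue = new BlockingCollection<T>();

        // One long-running task per inner sequence, so all of them start at once.
        Task[] producers = sources
            .Select(source => Task.Factory.StartNew(() =>
            {
                foreach (T item in source)
                {
                    queue.Add(item);
                }
            }, TaskCreationOptions.LongRunning))
            .ToArray();

        // Close the queue when every producer has finished (even if one faulted),
        // so the consumer does not block forever. Producer exceptions are not
        // propagated to the consumer in this sketch.
        Task.WhenAll(producers).ContinueWith(_ => queue.CompleteAdding());

        return queue.GetConsumingEnumerable();
    }
}
```

Items arrive in completion order across the sequences, so the relative order between different inner sequences is not deterministic.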
I ended up writing a new IEnumerable<T> implementation that does the following:
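// Note: assigning null to T only compiles if the enclosing class declares "where T : class".
// A null element in Source is also treated as the end of the sequence.
public IEnumerator<T> GetEnumerator()
{
    TaskFactory<T> taskFactory = new TaskFactory<T>();
    Task<T> task = null;
    IEnumerator<T> enumerator = Source.GetEnumerator();

    T result = null;
    do
    {
        if (task != null)
        {
            // Blocks until the previously requested item is ready.
            result = task.Result;
            if (result == null)
                break;
        }

        // Request the next item before yielding the current one.
        task = taskFactory.StartNew(() =>
        {
            if (enumerator.MoveNext())
                return enumerator.Current;
            else
                return null;
        });
        if (result != null)
            yield return result;
    }
    while (task != null);
}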

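It simply requests the first two results before returning the first one, and from then on always keeps one request in flight ahead of the result that has just been yielded.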
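
The same one-item-ahead idea can be sketched as a standalone extension method that also works for value types and null elements. PrefetchExtensions and Prefetch are hypothetical names, and the sketch assumes C# 7 tuples are available.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class PrefetchExtensions
{
    // Yields the items of source in order while the next item is already
    // being fetched on a thread-pool thread.
    public static IEnumerable<T> Prefetch<T>(this IEnumerable<T> source)
    {
        IEnumerator<T> enumerator = source.GetEnumerator();

        // Fetches one item; HasValue is false once the source is exhausted.
        Func<(bool HasValue, T Value)> fetchNext = () =>
            enumerator.MoveNext() ? (true, enumerator.Current) : (false, default(T));

        Task<(bool HasValue, T Value)> pending = Task.Run(fetchNext);
        try
        {
            while (true)
            {
                // Blocks until the item requested earlier is ready.
                var current = pending.GetAwaiter().GetResult();
                if (!current.HasValue)
                {
                    yield break;
                }

                // Start fetching the next item before handing out this one.
                pending = Task.Run(fetchNext);
                yield return current.Value;
            }
        }
        finally
        {
            // Let any in-flight fetch finish before disposing the enumerator,
            // so the two never touch it at the same time.
            try { pending.Wait(); } catch (AggregateException) { }
            enumerator.Dispose();
        }
    }
}
```

A call such as foo.GetBars().Prefetch() would then overlap the consumer's work on one Bar with the one-second wait for the next. If the consumer stops early, disposal waits for the fetch that is still in flight.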
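But this will not start enumerating all of the collections at the same time, because Parallel.ForEach() uses partitioning.
@svick Yes, that depends on the CPU. I could have used something like creating many tasks, but the answer would have been more complicated.
Can you explain in more detail how exactly you would use Rx here?
This does not actually prefetch anything, because the task is only started once execution of the method resumes after the yield return.
@svick My copy-paste failed. It is fixed now.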