C#: How to limit the number of concurrent async I/O operations?
The problem is that it starts 1000+ web requests at the same time. Is there an easy way to limit the number of concurrent async HTTP requests, so that no more than 20 web pages are being downloaded at any given time? How can this be done in the most efficient way?
Tags: c#, asynchronous, task-parallel-library, async-ctp, async-await
Although 1000 tasks may be queued very quickly, the Task Parallel Library by default runs only about as many of them concurrently as the machine has CPU cores. That means that on a four-core machine only 4 tasks will be executing at a given time (unless you lower MaxDegreeOfParallelism). Parallel computation should be used to speed up CPU-bound operations; here we are talking about I/O-bound operations. Your implementation should be asynchronous, unless you are overwhelming a busy single core on your multi-core CPU.
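EDIT: I like usr's suggestion of using an "async semaphore" here.
Unfortunately, the .NET Framework is missing the most important combinators for orchestrating parallel async tasks; nothing of the kind is built in. Look at the async semaphore class built by the most respectable Stephen Toub. What you want is called a semaphore, and you need an async version of it.
You can definitely do this in the latest versions of async for .NET, using .NET 4.5 Beta. The previous post from "usr" points to a good article written by Stephen Toub, but the less announced news is that the async semaphore actually made it into the Beta release of .NET 4.5. If you look at our beloved SemaphoreSlim class (which you should be using, since it performs better than the original Semaphore), it now boasts the WaitAsync(...) series of overloads, with all of the expected arguments: timeout intervals, cancellation tokens, all of your usual scheduling friends :) Stephen has also written a more recent blog post about the new .NET 4.5 goodies that came out with the beta. Last, the MyOuterMethod sample further down shows how to use SemaphoreSlim to throttle async methods.
Use MaxDegreeOfParallelism, which is an option you can specify in Parallel.ForEach():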
var options = new ParallelOptions { MaxDegreeOfParallelism = 20 };
Parallel.ForEach(urls, options,
    url =>
    {
        var client = new HttpClient();
        // Parallel.ForEach does not await async lambdas, so block on the result here.
        var html = client.GetStringAsync(url).Result;
        // do stuff with html
    });
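Last, but probably worth mentioning, is a solution that uses TPL-based scheduling. You can create delegate-bound tasks on the TPL that have not yet been started, and let a custom task scheduler limit the concurrency. In fact, there is an MSDN sample for it.
Essentially, you are going to want to create an Action or Task for each URL that you want to hit, put them in a List, and then process that list, limiting the number that can be processed in parallel.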
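The TPL-based scheduling idea mentioned above can be tried without writing a custom scheduler. The sketch below is not the MSDN sample; it assumes the built-in ConcurrentExclusiveSchedulerPair is an acceptable stand-in, and the names SchedulerThrottleDemo and RunAsync, as well as the Thread.Sleep workload, are made up for illustration. A scheduler only limits delegates while they occupy a thread, so this suits blocking (synchronous) work; an await inside the delegate would give the slot back early.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SchedulerThrottleDemo
{
    // Runs `workItems` blocking delegates on a scheduler capped at `maxConcurrency`
    // and returns the highest number of delegates observed running at once.
    public static async Task<int> RunAsync(int workItems, int maxConcurrency)
    {
        var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxConcurrency);
        var factory = new TaskFactory(pair.ConcurrentScheduler);

        var gate = new object();
        int running = 0, peak = 0;
        var tasks = new List<Task>();
        for (int i = 0; i < workItems; i++)
        {
            tasks.Add(factory.StartNew(() =>
            {
                lock (gate) { running++; if (running > peak) peak = running; }
                Thread.Sleep(20); // stand-in for a blocking download
                lock (gate) { running--; }
            }));
        }
        await Task.WhenAll(tasks);
        return peak;
    }

    public static async Task Main()
    {
        int peak = await RunAsync(workItems: 100, maxConcurrency: 20);
        Console.WriteLine($"Peak concurrency: {peak}");
    }
}
```

The reported peak depends on how many thread pool threads are available, but it should never exceed the cap passed to the scheduler pair.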
// Throttling async HTTP requests with SemaphoreSlim
public async Task MyOuterMethod()
{
    // let's say there is a list of 1000+ URLs
    string[] urls = { "http://google.com", "http://yahoo.com", ... };
    // now let's send HTTP requests to each of these URLs in parallel
    var allTasks = new List<Task>();
    var throttler = new SemaphoreSlim(initialCount: 20);
    foreach (var url in urls)
    {
        // do an async wait until we can schedule again
        await throttler.WaitAsync();
        // using Task.Run(...) to run the lambda in its own parallel
        // flow on the threadpool
        allTasks.Add(
            Task.Run(async () =>
            {
                try
                {
                    var client = new HttpClient();
                    var html = await client.GetStringAsync(url);
                }
                finally
                {
                    // always free the slot, even if the request failed
                    throttler.Release();
                }
            }));
    }
    // won't get here until all urls have been put into tasks
    await Task.WhenAll(allTasks);
    // won't get here until all tasks have completed in some way
    // (either success or exception)
}
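To see the SemaphoreSlim throttle at work without hitting the network, here is a minimal sketch that swaps the HTTP call for an injected delegate and records how many operations are in flight at once. It is a variation on the sample above, not the same code: it calls the async delegate directly instead of going through Task.Run, and SemaphoreThrottleDemo, RunAsync and the Task.Delay workload are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreThrottleDemo
{
    // Runs `operation` for every item with at most `maxConcurrency` calls in flight
    // and returns the highest number of calls observed in flight at once.
    public static async Task<int> RunAsync<T>(
        IEnumerable<T> items, int maxConcurrency, Func<T, Task> operation)
    {
        var gate = new object();
        int inFlight = 0, peak = 0;
        using (var throttler = new SemaphoreSlim(maxConcurrency))
        {
            async Task RunOneAsync(T item)
            {
                lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
                try
                {
                    await operation(item);
                }
                finally
                {
                    lock (gate) { inFlight--; }
                    throttler.Release(); // free the slot even if the operation failed
                }
            }

            var tasks = new List<Task>();
            foreach (var item in items)
            {
                // Asynchronously wait for a free slot before starting the next call.
                await throttler.WaitAsync();
                tasks.Add(RunOneAsync(item));
            }
            // Every task has released its slot once WhenAll completes,
            // so the semaphore can be disposed safely afterwards.
            await Task.WhenAll(tasks);
        }
        return peak;
    }

    public static async Task Main()
    {
        // Task.Delay stands in for an HTTP download.
        int peak = await RunAsync(Enumerable.Range(0, 100), 20, _ => Task.Delay(50));
        Console.WriteLine($"Peak in-flight operations: {peak}");
    }
}
```

Because a slot is taken before each call starts and released in the finally block, the measured peak should stay at or below the limit regardless of how long individual operations take.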
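My blog post shows how to do this both with Tasks and with Actions, and provides a sample project you can download and run to see both in action.
With Actions
If using Actions, you can use the built-in .NET Parallel.Invoke function. Here we limit it to running at most 20 threads in parallel.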
var listOfActions = new List<Action>();
foreach (var url in urls)
{
    var localUrl = url;
    // Note that we create the Action here, but do not start it.
    listOfActions.Add(() => CallUrl(localUrl));
}
var options = new ParallelOptions { MaxDegreeOfParallelism = 20 };
Parallel.Invoke(options, listOfActions.ToArray());
With Tasks
With Tasks there is no built-in function. However, you can use the one that I provide on my blog.
/// <summary>
/// Starts the given tasks and waits for them to complete. This will run, at most, the specified number of tasks in parallel.
/// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
/// </summary>
/// <param name="tasksToRun">The tasks to run.</param>
/// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, CancellationToken cancellationToken = new CancellationToken())
{
    await StartAndWaitAllThrottledAsync(tasksToRun, maxTasksToRunInParallel, -1, cancellationToken);
}

/// <summary>
/// Starts the given tasks and waits for them to complete. This will run the specified number of tasks in parallel.
/// <para>NOTE: If a timeout is reached before the Task completes, another Task may be started, potentially running more than the specified maximum allowed.</para>
/// <para>NOTE: If one of the given tasks has already been started, an exception will be thrown.</para>
/// </summary>
/// <param name="tasksToRun">The tasks to run.</param>
/// <param name="maxTasksToRunInParallel">The maximum number of tasks to run in parallel.</param>
/// <param name="timeoutInMilliseconds">The maximum milliseconds we should allow the max tasks to run in parallel before allowing another task to start. Specify -1 to wait indefinitely.</param>
/// <param name="cancellationToken">The cancellation token.</param>
public static async Task StartAndWaitAllThrottledAsync(IEnumerable<Task> tasksToRun, int maxTasksToRunInParallel, int timeoutInMilliseconds, CancellationToken cancellationToken = new CancellationToken())
{
    // Convert to a list of tasks so that we don't enumerate over it multiple times needlessly.
    var tasks = tasksToRun.ToList();
    using (var throttler = new SemaphoreSlim(maxTasksToRunInParallel))
    {
        var postTaskTasks = new List<Task>();
        // Have each task notify the throttler when it completes so that it decrements the number of tasks currently running.
        tasks.ForEach(t => postTaskTasks.Add(t.ContinueWith(tsk => throttler.Release())));
        // Start running each task.
        foreach (var task in tasks)
        {
            // Increment the number of tasks currently running and wait if too many are running.
            await throttler.WaitAsync(timeoutInMilliseconds, cancellationToken);
            cancellationToken.ThrowIfCancellationRequested();
            task.Start();
        }
        // Wait for all of the provided tasks to complete.
        // We wait on the list of "post" tasks instead of the original tasks. Otherwise there is a
        // potential race condition where the throttler's using block is exited before some Tasks
        // have had their "post" action completed; that action references the throttler, so it
        // would throw because it accesses a disposed object.
        await Task.WhenAll(postTaskTasks.ToArray());
    }
}
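var listOfTasks = new List<Task>();
foreach (var url in urls)
{
    var localUrl = url;
    // Note that we create the Task here, but do not start it.
    // The delegate blocks until CallUrl finishes: an async lambda passed to new Task(...)
    // would be compiled as async void, and the Task would complete at its first await.
    listOfTasks.Add(new Task(() => CallUrl(localUrl).GetAwaiter().GetResult()));
}
await Tasks.StartAndWaitAllThrottledAsync(listOfTasks, 20);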
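One thing to watch with unstarted Tasks: wrapping an async lambda in new Task(...) does not give a task that tracks the async work, which would quietly defeat a throttler that waits on those tasks. The following minimal sketch illustrates the difference from Task.Run; ColdTaskPitfallDemo and its method names are invented for the example, and the 200 ms delay is an arbitrary stand-in for a slow request.

```csharp
using System;
using System.Threading.Tasks;

public static class ColdTaskPitfallDemo
{
    // Returns true if awaiting the Task finished before the async body did.
    public static async Task<bool> FinishesEarlyWithTaskConstructorAsync()
    {
        bool bodyFinished = false;
        // The only matching constructor takes an Action, so the lambda is async void.
        var cold = new Task(async () =>
        {
            await Task.Delay(200);
            bodyFinished = true;
        });
        cold.Start();
        await cold; // completes as soon as the lambda reaches its first await
        return !bodyFinished;
    }

    // Same body, but Task.Run picks its Func<Task> overload and unwraps the inner task.
    public static async Task<bool> FinishesEarlyWithTaskRunAsync()
    {
        bool bodyFinished = false;
        await Task.Run(async () =>
        {
            await Task.Delay(200);
            bodyFinished = true;
        });
        return !bodyFinished;
    }

    public static async Task Main()
    {
        Console.WriteLine($"new Task(async ...) finished early: {await FinishesEarlyWithTaskConstructorAsync()}");
        Console.WriteLine($"Task.Run(async ...) finished early: {await FinishesEarlyWithTaskRunAsync()}");
    }
}
```

If the work is naturally asynchronous, passing Func<Task> delegates to the throttling helper (rather than cold Task objects) avoids the problem altogether.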
// DefaultMaxDegreeOfParallelism is assumed to be a constant defined elsewhere in the containing class.
public static Task ForEachAsync<TIn>(
    IEnumerable<TIn> inputEnumerable,
    Func<TIn, Task> asyncProcessor,
    int? maxDegreeOfParallelism = null)
{
    int maxAsyncThreadCount = maxDegreeOfParallelism ?? DefaultMaxDegreeOfParallelism;
    SemaphoreSlim throttler = new SemaphoreSlim(maxAsyncThreadCount, maxAsyncThreadCount);
    // Each projected task waits for a free slot before invoking the processor.
    IEnumerable<Task> tasks = inputEnumerable.Select(async input =>
    {
        await throttler.WaitAsync().ConfigureAwait(false);
        try
        {
            await asyncProcessor(input).ConfigureAwait(false);
        }
        finally
        {
            throttler.Release();
        }
    });
    return Task.WhenAll(tasks);
}
// let's say there is a list of 1000+ URLs
string[] urls = { "http://google.com", "http://yahoo.com", ... };
// now let's send HTTP requests to each of these URLs in parallel
// (ParallelForEachAsync is not defined in this thread; it is assumed to be an extension method from an external library)
await urls.ParallelForEachAsync(async (url) => {
    var client = new HttpClient();
    var html = await client.GetStringAsync(url);
}, maxDegreeOfParalellism: 20);
/// <summary>
/// Concurrently executes async actions for each item of <see cref="IEnumerable{T}"/>.
/// </summary>
/// <typeparam name="T">Type of IEnumerable</typeparam>
/// <param name="enumerable">instance of <see cref="IEnumerable{T}"/></param>
/// <param name="action">an async <see cref="Action" /> to execute</param>
/// <param name="maxActionsToRunInParallel">Optional, max number of actions to run in parallel.
/// Must be greater than 0</param>
/// <returns>A Task representing an async operation</returns>
/// <exception cref="ArgumentOutOfRangeException">If maxActionsToRunInParallel is less than 1</exception>
public static async Task ForEachAsyncConcurrent<T>(
    this IEnumerable<T> enumerable,
    Func<T, Task> action,
    int? maxActionsToRunInParallel = null)
{
    if (maxActionsToRunInParallel.HasValue)
    {
        using (var semaphoreSlim = new SemaphoreSlim(
            maxActionsToRunInParallel.Value, maxActionsToRunInParallel.Value))
        {
            var tasksWithThrottler = new List<Task>();
            foreach (var item in enumerable)
            {
                // Increment the number of currently running tasks and wait if they are more than limit.
                await semaphoreSlim.WaitAsync();
                tasksWithThrottler.Add(Task.Run(async () =>
                {
                    try
                    {
                        await action(item);
                    }
                    finally
                    {
                        // action has completed or failed, so decrement the number of currently running tasks;
                        // try/finally (rather than ContinueWith) lets exceptions reach Task.WhenAll
                        semaphoreSlim.Release();
                    }
                }));
            }
            // Wait for all of the provided tasks to complete.
            await Task.WhenAll(tasksWithThrottler.ToArray());
        }
    }
    else
    {
        await Task.WhenAll(enumerable.Select(item => action(item)));
    }
}
await enumerable.ForEachAsyncConcurrent(
    async item =>
    {
        await SomeAsyncMethod(item);
    },
    5);
// Caps the number of concurrent connections each ServicePoint (i.e. each host) may open
System.Net.ServicePointManager.DefaultConnectionLimit = 20;
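As far as I know, ServicePointManager only governs the HttpWebRequest-based stack, so an HttpClient on .NET Core or later would not pick up that setting. A rough equivalent there is the handler's MaxConnectionsPerServer property. The sketch below assumes such a runtime, and the ConnectionLimitSketch and CreateHandler names are made up for illustration. Either setting caps connections per host, not the total number of requests in flight across different hosts, so it does not replace the semaphore approach when the URLs point at many servers.

```csharp
using System;
using System.Net.Http;

public static class ConnectionLimitSketch
{
    // Builds a handler that opens at most `maxPerServer` connections to any single host.
    public static HttpClientHandler CreateHandler(int maxPerServer)
    {
        return new HttpClientHandler { MaxConnectionsPerServer = maxPerServer };
    }

    public static void Main()
    {
        // Disposing the client also disposes the handler it owns.
        using (var client = new HttpClient(CreateHandler(20)))
        {
            // Requests sent through `client` share at most 20 connections per host.
            Console.WriteLine("HttpClient configured with a per-host connection limit.");
        }
    }
}
```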
public static async Task<TResult[]> ForEachAsync<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, Task<TResult>> action,
    int maximumConcurrency = 1,
    bool onErrorContinue = false)
{
    // Arguments validation omitted
    var semaphore = new SemaphoreSlim(maximumConcurrency, maximumConcurrency);
    var results = new List<TResult>();
    var exceptions = new ConcurrentQueue<Exception>();
    int index = 0;
    try
    {
        foreach (var item in source)
        {
            var localIndex = index++;
            lock (results) results.Add(default); // Reserve space in the list
            await semaphore.WaitAsync(); // continue on captured context
            if (!onErrorContinue && !exceptions.IsEmpty) { semaphore.Release(); break; }
            FireAndAwaitTask();
            async void FireAndAwaitTask()
            {
                try
                {
                    var task = action(item);
                    var result = await task.ConfigureAwait(false);
                    lock (results) results[localIndex] = result;
                }
                catch (Exception ex) { exceptions.Enqueue(ex); return; }
                finally { semaphore.Release(); }
            }
        }
    }
    catch (Exception ex) { exceptions.Enqueue(ex); }
    // Wait for all pending operations to complete
    for (int i = 0; i < maximumConcurrency; i++)
        await semaphore.WaitAsync().ConfigureAwait(false);
    if (!exceptions.IsEmpty) throw new AggregateException(exceptions);
    lock (results) return results.ToArray();
}
public static Task ForEachAsync<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, Task> action,
    int maximumConcurrency = 1,
    bool onErrorContinue = false)
{
    // Arguments validation omitted
    return ForEachAsync<TSource, object>(source, async item =>
    {
        await action(item).ConfigureAwait(false); return null;
    }, maximumConcurrency, onErrorContinue);
}
await urls.ForEachAsync(async url =>
{
    var html = await httpClient.GetStringAsync(url);
    TextBox1.AppendText($"Url: {url}, {html.Length:#,0} chars\r\n");
}, maximumConcurrency: 10, onErrorContinue: true);