
C#: Processing a collection of URLs in parallel and returning an IEnumerable


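I have a collection of URLs to scrape, and I want to download the resources in parallel while returning a strongly typed collection of results.

Given WebClient.DownloadString() and a method MyTypedResult Process(string), how do I wrap them to get a string[] urls => IEnumerable<MyTypedResult> conversion?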

string[] urls = {"url1","url2","url3"};
List<MyTypedResult> ResultCollection = new List<MyTypedResult>();
foreach (var u in urls)
{
    WebClient wc = new WebClient();
    var content = wc.DownloadString(u);
    MyTypedResult r = Process(content);
    ResultCollection.Add(r);
}

I want the web requests to run in parallel, but I need the results collected in a list.

You can use HttpClient, the new toy in .NET 4.5, to fetch the results in parallel:

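// Requires: using System.Linq; using System.Net.Http; using System.Threading.Tasks;
var httpClient = new HttpClient();

// ToArray() runs the Select once, so each request is started exactly once.
// Without it, every enumeration of the lazy query would issue the requests again.
var tasks = urls.Select(url => httpClient.GetStringAsync(url)
                        .ContinueWith(task =>
                        {
                            string response = task.Result;
                            return ConvertToStrongType(response);
                        }))
                .ToArray();

// Block until every download and conversion has finished.
Task.WaitAll(tasks);
var results = tasks.Select(t => t.Result);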
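
As a variation on the answer above, the same idea can be written with async/await and Task.WhenAll instead of ContinueWith and Task.WaitAll. This is only a minimal sketch: the MyTypedResult class body, the Process implementation, and the ParallelDownloader and DownloadAllAsync names are hypothetical stand-ins, and the download step is passed in as a delegate so the method can be tried without a network connection.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's result type.
public sealed class MyTypedResult
{
    public int Length { get; set; }
}

public static class ParallelDownloader
{
    // Hypothetical stand-in for the question's Process(string) method.
    public static MyTypedResult Process(string content)
    {
        return new MyTypedResult { Length = content.Length };
    }

    // Starts every download up front, then awaits all of them together.
    // The download delegate is an assumption made for this sketch.
    public static async Task<IEnumerable<MyTypedResult>> DownloadAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> download)
    {
        // ToArray() forces the Select to run once, so each request is issued exactly once.
        Task<MyTypedResult>[] tasks = urls
            .Select(async url => Process(await download(url)))
            .ToArray();

        // Task.WhenAll yields the results in the same order as the input tasks.
        return await Task.WhenAll(tasks);
    }
}
```

With a real client, the call could look like DownloadAllAsync(urls, url => httpClient.GetStringAsync(url)), where httpClient is a shared HttpClient instance. Because Task.WhenAll preserves task order, the results line up with the input URLs.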

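Here is code that uses Parallel.ForEach to download the content from the URLs in parallel. You need a thread-safe collection such as ConcurrentBag (the .NET base class library has no ConcurrentList) so that the collection can be filled in parallel without thread-locking problems.

// Requires: using System.Collections.Concurrent; using System.Net; using System.Threading.Tasks;
void YourTask()
{
    string[] urls = {"url1","url2","url3"};
    ConcurrentBag<MyTypedResult> ResultCollection = new ConcurrentBag<MyTypedResult>();

    Parallel.ForEach(urls, url =>
    {
        // Keep the value returned by GetData and add it to the thread-safe bag.
        MyTypedResult myTypedResult = GetData(url);
        ResultCollection.Add(myTypedResult);
    });

    // By this line all parallel tasks have completed and ResultCollection holds the results (in no guaranteed order).
}

MyTypedResult GetData(string url)
{
    WebClient wc = new WebClient();
    var content = wc.DownloadString(url);
    MyTypedResult r = Process(content);
    return r;
}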
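
If the results should come back in the same order as the input URLs, PLINQ is one possible alternative to Parallel.ForEach with a concurrent collection. The sketch below is an illustration only, not the answerer's code: the PlinqDownloader and DownloadAll names and the maxParallelism parameter are hypothetical, and getData stands for a blocking function such as the GetData method above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqDownloader
{
    // Maps each URL to a result on worker threads.
    // AsOrdered() keeps the output in the same order as the input sequence.
    public static List<TResult> DownloadAll<TResult>(
        IEnumerable<string> urls, Func<string, TResult> getData, int maxParallelism = 4)
    {
        return urls
            .AsParallel()
            .AsOrdered()
            .WithDegreeOfParallelism(maxParallelism) // upper bound on concurrent workers
            .Select(getData)
            .ToList();
    }
}
```

A call such as PlinqDownloader.DownloadAll(urls, GetData) returns a List<MyTypedResult>, which is also usable as an IEnumerable<MyTypedResult>. PLINQ is aimed mainly at CPU-bound work, so for blocking downloads each worker thread simply waits on its request; that is acceptable for a handful of URLs but may not scale as well as the asynchronous HttpClient approach.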

Here is an Rx version using HttpClient:

var urls = new[] { "url1", "url2", "url3" };
var client = new HttpClient();
var results = from url in urls.ToObservable()
              from content in client.GetStringAsync(url).ToObservable()
              select Process(content);
var enumerable = results.ToEnumerable();

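It's not clear what you are asking. Please show more code.

Wow, this looks simple. I'll give it a try and see where it takes me. Now, is there a way to run these downloads in parallel?

@Alexander Taran: Sure, using GetStringAsync makes the code work in parallel.

Well, looking at the requests in Fiddler, they definitely go out one after another.

@AlexanderTaran: Of course you will see them one after another in Fiddler, but the code uses a task to process each URL, which almost means one thread per request. Picture it: the first request is sent, and the code does not wait for the response to come back for processing; it goes on to send the second request. That is what you meant, right?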