F# async operations with conflicts, ordering, and mutual exclusion

F# makes it easy to define asynchronous computations with the async builder. You can write an entire program and then pass it to Async.RunSynchronously.

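The problem I have is that some async operations must not run at the same time; they should be forced to wait for other async operations to complete. This is a bit like a mutex. However, I don't want to simply chain them all one after another, because that would be inefficient.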

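Concrete example: download cache

Suppose I want to fetch some remote files using a local file cache. In my application I call fetchFile (which returns an Async) in many places, but if I call fetchFile on the same URL concurrently, the cache is corrupted by multiple writes. Instead, fetchFile should behave as follows: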

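  • If there is no cache entry, download the file into the cache, then read the cached contents
  • If the cache entry is currently being written, wait for the write to finish, then read the contents
  • If the cache entry exists and is complete, just read the cached contents
  • fetchFile on two different URLs should run in parallel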
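I imagine some kind of stateful DownloadManager class to which requests can be sent and ordered internally.

How do F# programmers usually implement this kind of logic with async?

Hypothetical usage: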

let dm = new DownloadManager()

let urls = [
  "https://www.google.com"; 
  "https://www.google.com"; 
  "https://www.wikipedia.org"; 
  "https://www.google.com"; 
  "https://www.bing.com"; 
]

let results = 
  urls
  |> Seq.map dm.Download
  |> Async.Parallel
  |> Async.RunSynchronously

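Note: I previously asked how to run async operations in a semi-parallel way, but I now realise that approach is hard to write.

Note: I don't need to worry about multiple instances of the application running at the same time. In-memory locking is sufficient.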

Update

Better than a lazy value is Async.StartChild, as suggested by Petricek, so I changed lazyDownload to asyncDownload.


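You can use a MailboxProcessor as a download manager that handles the cache. MailboxProcessor is an F# construct that processes a queue of messages, ensuring there are no conflicts.

First, you need a processor that can maintain state: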

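let stateFull hndl initState =
    MailboxProcessor.Start(fun inbox ->
        let rec loop state : Async<unit> = async {
            try
                // Each message is a function from the current state to the next state.
                let! f        = inbox.Receive()
                let! newState = f state
                return! loop newState
            with e ->
                // On failure, the handler decides which state to continue with.
                return! loop (hndl e state)
        }
        loop initState
    )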
To call the mailbox we need to use PostAndReply:

let applyReplyS f (agent: MailboxProcessor<'a->Async<'a>>) = 
    agent.PostAndReply(fun (reply:AsyncReplyChannel<'r>) -> 
        fun v -> async {
            let st, r = f v
            reply.Reply r
            return st 
        })
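
To make the state-passing idea concrete, here is a minimal sketch that reuses the two helpers above with a plain integer as the state. The counter agent and the next function are hypothetical names introduced only for this illustration; they are not part of the answer.

```fsharp
// Helpers as defined in the answer, repeated so the sketch is self-contained.
let stateFull hndl initState =
    MailboxProcessor.Start(fun inbox ->
        let rec loop state : Async<unit> = async {
            try
                let! f        = inbox.Receive()
                let! newState = f state
                return! loop newState
            with e ->
                return! loop (hndl e state)
        }
        loop initState
    )

let applyReplyS f (agent: MailboxProcessor<'a -> Async<'a>>) =
    agent.PostAndReply(fun (reply: AsyncReplyChannel<'r>) ->
        fun v -> async {
            let st, r = f v
            reply.Reply r
            return st
        })

// Hypothetical agent whose state is a single integer.
let counter = stateFull (fun e s -> printfn "%A" e; s) 0

// Replies with the current value and stores the incremented value as the new state.
let next () = counter |> applyReplyS (fun n -> n + 1, n)

let a = next ()   // 0
let b = next ()   // 1
printfn "%d %d" a b
```

Because the agent handles one message at a time, each call sees the state left behind by the previous call, which is the property the cache relies on.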
asyncDownload is just a dummy function that returns a string with timing information:

let asyncDownload url = 
    async { 
        let started = System.DateTime.UtcNow.Ticks
        do! Async.Sleep 30
        let finished = System.DateTime.UtcNow.Ticks
        let r = sprintf "Downloaded  %A it took: %dms %s" (started / 10000L) ((finished - started) / 10000L) url
        printfn "%s" r
        return r
    }

Now the folder function that checks the cache:

let folderCache url cache  =
    cache 
    |> Map.tryFind url
    |> Option.map(fun ld -> cache, ld)
    |> Option.defaultWith (fun () -> 
        // Not cached yet: start the download as a child and cache the pending result.
        let ld = asyncDownload url |> Async.StartChild |> Async.RunSynchronously
        cache |> Map.add url ld, ld
    )

Finally, the download manager and our download function:

let downloadManager = 
    stateFull (fun e s -> printfn "%A" e ; s) (Map.empty : Map<string, _>)

let downloadUrl url =
    downloadManager 
    |> applyReplyS (folderCache url)

// val downloadUrl: url: string -> Async<string>

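I agree with @AMieres that a mailbox processor is a good way to do this. My version of the code is a bit less general: it uses the mailbox processor directly for this purpose, so it may be somewhat simpler.

Our mailbox processor has just one message: you ask it to download a URL and it gives you back an asynchronous workflow that you can wait on to get the result: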

type DownloadMessage = 
  | Download of string * AsyncReplyChannel<Async<string>>
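In the mailbox processor we keep a mutable cache (this is fine because the mailbox processor handles messages synchronously, one at a time). When we receive a download request, we check whether the cache already holds a download for that URL. If not, we start the download as a child async and add it to the cache, so the cache contains asynchronous workflows that represent the results of running downloads.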

let downloadCache = MailboxProcessor.Start(fun inbox -> async {
  let cache = System.Collections.Generic.Dictionary<_, _>()
  while true do
    let! (Download(url, repl)) = inbox.Receive()
    if not (cache.ContainsKey url) then 
      let! proc = asyncDownload url |> Async.StartChild
      cache.Add(url, proc)
    repl.Reply(cache.[url]) })
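
The agent above only hands back a pending workflow, so callers still need a small wrapper that posts the message and then waits on the reply. The following is a sketch of one way to write it, assuming a stub asyncDownload; the started queue, the stub, and the sample URLs are hypothetical and exist only to show that each URL is downloaded once.

```fsharp
open System.Collections.Concurrent
open System.Collections.Generic

// Records every URL that is actually downloaded (illustration only).
let started = ConcurrentQueue<string>()

// Hypothetical stand-in for a real download.
let asyncDownload (url: string) = async {
    started.Enqueue url
    do! Async.Sleep 30
    return sprintf "contents of %s" url }

type DownloadMessage =
    | Download of string * AsyncReplyChannel<Async<string>>

// Same agent as above: one child download per URL, shared by all callers.
let downloadCache = MailboxProcessor.Start(fun inbox -> async {
    let cache = Dictionary<string, Async<string>>()
    while true do
        let! (Download(url, repl)) = inbox.Receive()
        if not (cache.ContainsKey url) then
            let! proc = asyncDownload url |> Async.StartChild
            cache.Add(url, proc)
        repl.Reply(cache.[url]) })

// Ask the agent for the pending download, then wait for its result.
let downloadUrl url = async {
    let! pending = downloadCache.PostAndAsyncReply(fun ch -> Download(url, ch))
    return! pending }

let results =
    [ "https://www.google.com"; "https://www.google.com"; "https://www.bing.com" ]
    |> List.map downloadUrl
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "%d results, %d actual downloads" results.Length started.Count
```

Using PostAndAsyncReply rather than PostAndReply keeps the caller from blocking a thread while the agent is busy; requests for different URLs still run in parallel because the agent only starts the child and replies immediately.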

Here is a simplified version based on @Tomas Petricek's answer.

Suppose we have a download function that, given a URL, returns an Async. This is a dummy version:

let asyncDownload url = 
    async { 
        let started = System.DateTime.UtcNow.Ticks
        do! Async.Sleep 30
        let finished = System.DateTime.UtcNow.Ticks
        let r = sprintf "Downloaded  %A it took: %dms %s" (started / 10000L) ((finished - started) / 10000L) url
        printfn "%s" r
        return r
    }
Here we have some simple, generic Mailbox helper functions in their own module:

module Mailbox =
    let iterA hndl f =
        MailboxProcessor.Start(fun inbox ->
            async {
                while true do
                    try       let!   msg = inbox.Receive()
                              do!  f msg
                    with e -> hndl e
            }
        )
    let callA hndl f = iterA hndl (fun ((replyChannel: AsyncReplyChannel<_>), msg) -> async {
        let! r = f msg
        replyChannel.Reply r
    })
    let call hndl f = callA hndl (fun msg -> async { return f msg } )
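
As a quick illustration of how these helpers are meant to be called, here is a minimal sketch that wraps an ordinary function in an agent. The squarer agent is a hypothetical example and not part of the answer; the module is repeated (with the try block laid out on separate lines) so the sketch is self-contained.

```fsharp
module Mailbox =
    // Runs f on every message; exceptions are passed to hndl and the loop continues.
    let iterA hndl f =
        MailboxProcessor.Start(fun inbox ->
            async {
                while true do
                    try
                        let! msg = inbox.Receive()
                        do! f msg
                    with e -> hndl e
            }
        )
    // Messages are (replyChannel, payload) pairs; the result of f is sent back.
    let callA hndl f = iterA hndl (fun ((replyChannel: AsyncReplyChannel<_>), msg) -> async {
        let! r = f msg
        replyChannel.Reply r
    })
    // Same as callA, for a plain (non-async) function.
    let call hndl f = callA hndl (fun msg -> async { return f msg })

// Hypothetical agent that squares integers, one request at a time.
let squarer = Mailbox.call (printfn "%A") (fun (x: int) -> x * x)

let nine = squarer.PostAndReply(fun reply -> reply, 3)
printfn "%d" nine
```

The caller builds the message by pairing the reply channel with the payload, which is the same shape the download manager uses with a URL as the payload.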
The inferred signature of call is:

val call: 
   hndl: exn -> unit ->
   f   : 'a -> 'b    
      -> MailboxProcessor<AsyncReplyChannel<'b> * 'a>

The first parameter is an exception handler and the second is a function that returns a value. Here is how we define the download manager:

let downloadManager = 
    let dict = new System.Collections.Generic.Dictionary<string, _>()
    Mailbox.call (printfn "%A") (fun url ->         
        if dict.ContainsKey url then dict.[url] else
        let result = asyncDownload url |> Async.StartChild |> Async.RunSynchronously
        dict.Add(url, result)
        result
    )

let downloadUrl url = downloadManager.PostAndReply(fun reply -> reply, url)

Here is a test:

let s = System.DateTime.UtcNow.Ticks
printfn "started %A" (s / 10000L)
let res = 
    List.init 50 (fun i -> i, downloadUrl (string <| i % 5) )
    |> List.groupBy (snd >> Async.RunSynchronously)
    |> List.map (fun (t, ts) -> sprintf "%s - %A" t (ts |> List.map fst ) )

let f = System.DateTime.UtcNow.Ticks
printfn "finish  %A" (f / 10000L)

printfn "elapsed %dms" ((f - s) / 10000L)

res |> printfn "Result: \n%A"

Output:

started 63676682503885L
Downloaded  63676682503911L it took: 34ms 1
Downloaded  63676682503912L it took: 33ms 2
Downloaded  63676682503911L it took: 37ms 0
Downloaded  63676682503912L it took: 33ms 3
Downloaded  63676682503912L it took: 33ms 4
finish  63676682503994L
elapsed 109ms
Result: 
["Downloaded  63676682503911L it took: 37ms 0 - [0; 5; 10; 15; 20; 25; 30; 35; 40; 45]";
 "Downloaded  63676682503911L it took: 34ms 1 - [1; 6; 11; 16; 21; 26; 31; 36; 41; 46]";
 "Downloaded  63676682503912L it took: 33ms 2 - [2; 7; 12; 17; 22; 27; 32; 37; 42; 47]";
 "Downloaded  63676682503912L it took: 33ms 3 - [3; 8; 13; 18; 23; 28; 33; 38; 43; 48]";
 "Downloaded  63676682503912L it took: 33ms 4 - [4; 9; 14; 19; 24; 29; 34; 39; 44; 49]"]

Comments:

I need a while to digest this! Will this download different URLs in parallel? Only identical URLs should be queued.

If this cache may be accessed by multiple servers in your system, i.e. multiple scaled-out web servers in a farm, I would suggest using Redlock.Net as a distributed lock rather than a simple mailbox.

I added some test code and replaced Async with Lazy. You can see how they all start at the same time and each URL is executed only once.

@AMieres Thanks for the correction. I will delete my comment to avoid confusion.