F# asynchronous operations with conflicts/ordering/mutual exclusion
F# makes it easy to define asynchronous computations with the async builder. You can write a whole program and then pass it to Async.RunSynchronously.
The problem I have is that some async operations must not run at the same time; they should be forced to wait for other async operations to complete. It is a bit like a mutex. However, I don't want to simply run everything in series, because that would be inefficient.
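Concrete example: a download cache
Suppose I want to fetch some remote files using a local file cache. In my application I call fetchFile (which returns an Async) in many places, but if I call fetchFile for the same URL concurrently, the cache gets corrupted by multiple writes. Instead, fetchFile should behave as follows:
- If there is no cache entry, download the file into the cache, then read the cache contents
- If the cache entry is currently being written, wait for the write to finish, then read the contents
- If the cache entry exists and is complete, just read the cache contents
Calls to fetchFile for two different URLs should run in parallel.
One option might be a DownloadManager class to which requests can be sent and ordered internally.
How do F# programmers typically implement this kind of logic with async?
Hypothetical usage: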
let dm = new DownloadManager()
let urls = [
    "https://www.google.com";
    "https://www.google.com";
    "https://www.wikipedia.org";
    "https://www.google.com";
    "https://www.bing.com";
]
let results =
    urls
    |> Seq.map dm.Download
    |> Async.Parallel
    |> Async.RunSynchronously
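Note: I previously asked how to run async operations in a semi-parallel way, but I now realize that approach is hard to write.
Note: I don't need to worry about multiple instances of the application running at the same time. In-memory locking is sufficient.

Update: better than a lazy value is Async.StartChild, as suggested by Petricek, so I changed lazyDownload to asyncDownload.
You can use a MailboxProcessor as a download manager that handles the cache. MailboxProcessor is an F# construct that processes a queue of messages one at a time, which ensures there are no conflicts.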
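To make the "one message at a time" guarantee concrete, here is a minimal sketch that is separate from the download manager itself. The `CounterMessage` type and the `counter` agent are hypothetical names introduced only for this illustration. Many threads post to the agent at once, yet the state is never updated concurrently because only the agent's loop touches it.

```fsharp
// Hypothetical message type, used only for this illustration.
type CounterMessage =
    | Increment
    | Get of AsyncReplyChannel<int>

// The agent owns the count; the recursive loop handles one message at a time.
let counter =
    MailboxProcessor.Start(fun inbox ->
        let rec loop count = async {
            let! msg = inbox.Receive()
            match msg with
            | Increment -> return! loop (count + 1)
            | Get reply ->
                reply.Reply count
                return! loop count
        }
        loop 0)

// Post 1000 increments from parallel workflows.
[ for _ in 1 .. 1000 -> async { counter.Post Increment } ]
|> Async.Parallel
|> Async.RunSynchronously
|> ignore

// Get is queued after every Increment, so no update has been lost.
let total = counter.PostAndReply(fun reply -> Get reply)
```

The same idea carries over to a cache: if only the agent reads and writes the cache, two requests for the same URL cannot race.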
First, you need a processor that can maintain state:
let stateFull hndl initState =
    MailboxProcessor.Start(fun inbox ->
        // Each message is a function from the current state to the new state.
        let rec loop state : Async<unit> = async {
            try
                let! f = inbox.Receive()
                let! newState = f state
                return! loop newState
            with e -> return! loop (hndl e state)
        }
        loop initState
    )
To call the mailbox we need to use PostAndReply:
let applyReplyS f (agent: MailboxProcessor<'a->Async<'a>>) =
    agent.PostAndReply(fun (reply:AsyncReplyChannel<'r>) ->
        fun v -> async {
            // f returns the new state together with the value to reply with.
            let st, r = f v
            reply.Reply r
            return st
        })
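asyncDownload is just a dummy function that returns a string and timing information.
Now, the folder function that checks the cache:
let folderCache url cache =
    cache
    |> Map.tryFind url
    |> Option.map (fun ld -> cache, ld)
    |> Option.defaultWith (fun () ->
        // Not cached yet: start the download as a child and store the handle.
        let ld = asyncDownload url |> Async.StartChild |> Async.RunSynchronously
        cache |> Map.add url ld, ld
    )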
Finally, our download function:
let asyncDownload url =
    async {
        let started = System.DateTime.UtcNow.Ticks
        do! Async.Sleep 30
        let finished = System.DateTime.UtcNow.Ticks
        let r = sprintf "Downloaded %A it took: %dms %s" (started / 10000L) ((finished - started) / 10000L) url
        printfn "%s" r
        return r
    }
let downloadUrl url =
    downloadManager
    |> applyReplyS (folderCache url)
// val downloadUrl: url: string -> Async<string>
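I agree with @AMieres that a mailbox processor is a good way to do this. My version of the code is a bit less generic: it uses the mailbox processor directly for this one purpose, so it may be somewhat simpler.
Our mailbox processor has just one message. You ask it to download a URL, and it gives you back an async workflow that you can wait on to get the result:
type DownloadMessage =
    | Download of string * AsyncReplyChannel<Async<string>>
In the mailbox processor we keep a mutable cache (this is fine, because the mailbox processor handles messages one at a time). When we receive a download request, we check whether the cache already contains that download. If not, we start the download as a child async and add it to the cache, so the cache contains async workflows that represent the result of a running download.
let downloadCache = MailboxProcessor.Start(fun inbox -> async {
    let cache = System.Collections.Generic.Dictionary<_, _>()
    while true do
        let! (Download(url, repl)) = inbox.Receive()
        if not (cache.ContainsKey url) then
            let! proc = asyncDownload url |> Async.StartChild
            cache.Add(url, proc)
        repl.Reply(cache.[url]) })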
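The text above does not show how a caller uses this agent, so the following is a sketch of one way it might be done. The `download` helper, the `startedDownloads` counter and the stand-in `asyncDownload` are assumptions made for illustration; the message type and the agent repeat the definitions above so that the block runs on its own. The helper asks the agent for the (possibly shared) workflow and then awaits it.

```fsharp
open System.Collections.Generic

// Stand-in download (an assumption): counts how many downloads really start.
let startedDownloads = ref 0
let asyncDownload (url: string) = async {
    System.Threading.Interlocked.Increment(startedDownloads) |> ignore
    do! Async.Sleep 30
    return sprintf "contents of %s" url }

type DownloadMessage =
    | Download of string * AsyncReplyChannel<Async<string>>

let downloadCache = MailboxProcessor.Start(fun inbox -> async {
    let cache = Dictionary<string, Async<string>>()
    while true do
        let! (Download(url, repl)) = inbox.Receive()
        if not (cache.ContainsKey url) then
            let! proc = asyncDownload url |> Async.StartChild
            cache.Add(url, proc)
        repl.Reply(cache.[url]) })

// Hypothetical helper: get the workflow from the agent, then wait for its result.
let download url = async {
    let! work = downloadCache.PostAndAsyncReply(fun ch -> Download(url, ch))
    return! work }

// Five requests, three distinct URLs: "a" is requested three times.
let results =
    [ "a"; "a"; "b"; "a"; "c" ]
    |> List.map download
    |> Async.Parallel
    |> Async.RunSynchronously
```

Because the agent only starts the child and replies immediately, it never waits for a download to finish, so different URLs proceed in parallel while repeated URLs share one workflow.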
Here is a simplified version based on @Tomas Petricek's answer.
Suppose we have a download function that, given a URL, returns an Async. This is a dummy version:
let asyncDownload url =
    async {
        let started = System.DateTime.UtcNow.Ticks
        do! Async.Sleep 30
        let finished = System.DateTime.UtcNow.Ticks
        let r = sprintf "Downloaded %A it took: %dms %s" (started / 10000L) ((finished - started) / 10000L) url
        printfn "%s" r
        return r
    }
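Here are some simple, generic mailbox helper functions in their own module:
module Mailbox =
    // Runs f on every message; exceptions go to hndl and the loop continues.
    let iterA hndl f =
        MailboxProcessor.Start(fun inbox ->
            async {
                while true do
                    try
                        let! msg = inbox.Receive()
                        do! f msg
                    with e -> hndl e
            }
        )
    // Like iterA, but replies to the caller with the result of the async f.
    let callA hndl f = iterA hndl (fun ((replyChannel: AsyncReplyChannel<_>), msg) -> async {
        let! r = f msg
        replyChannel.Reply r
    })
    // Same as callA for a plain (non-async) function f.
    let call hndl f = callA hndl (fun msg -> async { return f msg } )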
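The first argument is an exception handler and the second is a function that returns a value. Here is how we define the download manager:
// Download manager for the stateFull / applyReplyS version shown earlier:
let downloadManager =
    stateFull (fun e s -> printfn "%A" e ; s) (Map.empty : Map<string, _>)
// Download manager built on the Mailbox helpers:
let downloadManager =
    let dict = new System.Collections.Generic.Dictionary<string, _>()
    Mailbox.call (printfn "%A") (fun url ->
        if dict.ContainsKey url then dict.[url] else
            let result = asyncDownload url |> Async.StartChild |> Async.RunSynchronously
            dict.Add(url, result)
            result
    )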
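As a possible variation, and not something the answer itself proposes, the manager could be built on `Mailbox.callA` instead of `Mailbox.call`, so that the child is started with `let!` rather than through `Async.RunSynchronously` inside the agent. The sketch below repeats the helpers so that it runs on its own; the stand-in `asyncDownload` and the `startedDownloads` counter are assumptions added for illustration.

```fsharp
open System.Collections.Generic

module Mailbox =
    let iterA hndl f =
        MailboxProcessor.Start(fun inbox ->
            async {
                while true do
                    try
                        let! msg = inbox.Receive()
                        do! f msg
                    with e -> hndl e
            })
    let callA hndl f =
        iterA hndl (fun ((replyChannel: AsyncReplyChannel<_>), msg) -> async {
            let! r = f msg
            replyChannel.Reply r })

// Stand-in download (an assumption): counts how many downloads really start.
let startedDownloads = ref 0
let asyncDownload (url: string) = async {
    System.Threading.Interlocked.Increment(startedDownloads) |> ignore
    do! Async.Sleep 30
    return sprintf "contents of %s" url }

let downloadManager =
    let dict = Dictionary<string, Async<string>>()
    Mailbox.callA (printfn "%A") (fun url -> async {
        match dict.TryGetValue url with
        | true, cached -> return cached
        | _ ->
            // Start the download without blocking the agent's thread.
            let! child = asyncDownload url |> Async.StartChild
            dict.Add(url, child)
            return child })

let downloadUrl url = downloadManager.PostAndReply(fun reply -> (reply, url))

// Ten requests spread over three distinct URLs ("0", "1", "2").
let results =
    List.init 10 (fun i -> downloadUrl (string (i % 3)))
    |> Async.Parallel
    |> Async.RunSynchronously
```

The observable behavior should match the version above: each distinct URL is downloaded once and every caller for that URL awaits the same workflow.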
Here is a test:
let s = System.DateTime.UtcNow.Ticks
printfn "started %A" (s / 10000L)
let res =
    List.init 50 (fun i -> i, downloadUrl (string <| i % 5) )
    |> List.groupBy (snd >> Async.RunSynchronously)
    |> List.map (fun (t, ts) -> sprintf "%s - %A" t (ts |> List.map fst ) )
let f = System.DateTime.UtcNow.Ticks
printfn "finish %A" (f / 10000L)
printfn "elapsed %dms" ((f - s) / 10000L)
res |> printfn "Result: \n%A"
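I'll need a while to digest this! Will this download different URLs in parallel? Only identical URLs should be queued.
If you are on a system where this cache may be accessed by multiple servers, i.e. several scaled-out web servers in a farm, I would suggest using Redlock.Net as a distributed lock rather than a simple mailbox.
I added some test code and replaced the lazy value with an Async. You can see how they all start at the same time and that each URL is executed only once.
@AMieres Thanks for the correction. I will delete my comment to avoid confusion.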
val call:
    hndl: exn -> unit ->
    f   : 'a -> 'b
       -> MailboxProcessor<AsyncReplyChannel<'b> * 'a>
let downloadUrl url = downloadManager.PostAndReply(fun reply -> reply, url)
started 63676682503885L
Downloaded 63676682503911L it took: 34ms 1
Downloaded 63676682503912L it took: 33ms 2
Downloaded 63676682503911L it took: 37ms 0
Downloaded 63676682503912L it took: 33ms 3
Downloaded 63676682503912L it took: 33ms 4
finish 63676682503994L
elapsed 109ms
Result:
["Downloaded 63676682503911L it took: 37ms 0 - [0; 5; 10; 15; 20; 25; 30; 35; 40; 45]";
"Downloaded 63676682503911L it took: 34ms 1 - [1; 6; 11; 16; 21; 26; 31; 36; 41; 46]";
"Downloaded 63676682503912L it took: 33ms 2 - [2; 7; 12; 17; 22; 27; 32; 37; 42; 47]";
"Downloaded 63676682503912L it took: 33ms 3 - [3; 8; 13; 18; 23; 28; 33; 38; 43; 48]";
"Downloaded 63676682503912L it took: 33ms 4 - [4; 9; 14; 19; 24; 29; 34; 39; 44; 49]"]