Parallel processing: Mmap and SharedArray

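I am running into a very strange problem when parallelizing some functions. I know I should post an MWE, but I cannot reproduce the problem in a simple example.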

@everywhere function simulSample(Profits,ν,J,TerminalT,RelevantT,N,S,params,thresh;RelevantPer=RelevantPeriod)
    Dₑ = rand([0],J,1) # Final Condition
    Dᵢ = rand([0],J,1) # Initial Condition
    tempData=SharedArray{Int8}(J,RelevantT,S*N)
    @inbounds @sync @distributed for n=1:S*N
    #for n=1:N
        #println(n)
        tempData[:,:,n]=solver(Dᵢ,Dₑ,Profits[:,:,n],continent,cont_lang,language,J,TerminalT,RelevantT,RelevantPeriod,params,thresh,ν[:,:,n])
    end
    return tempData
end
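This function is called by another function inside an iterative procedure. It works on the first iteration, but on the second iteration I get the following error: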

SystemError: mmap: The operation completed successfully. 
#windowserror#45(::Nothing, ::typeof(Base.windowserror), ::Symbol, ::Bool) at error.jl:148
windowserror at error.jl:148 [inlined]
#mmap#1(::Bool, ::Bool, ::typeof(Mmap.mmap), ::Mmap.Anonymous, ::Type{Array{Int8,3}}, ::Tuple{Int64,Int64,Int64}, ::Int64) at Mmap.jl:221
mmap(::Mmap.Anonymous, ::Type{Array{Int8,3}}, ::Tuple{Int64,Int64,Int64}, ::Int64) at Mmap.jl:186
_shm_mmap_array(::Type, ::Tuple{Int64,Int64,Int64}, ::String, ::UInt16) at SharedArrays.jl:670
shm_mmap_array(::Type, ::Tuple{Int64,Int64,Int64}, ::String, ::UInt16) at SharedArrays.jl:649
#call#3(::Bool, ::Array{Int64,1}, ::Type{SharedArray{Int8,3}}, ::Tuple{Int64,Int64,Int64}) at SharedArrays.jl:118
Type at SharedArrays.jl:105 [inlined]
#call#10 at SharedArrays.jl:161 [inlined]
Type at SharedArrays.jl:161 [inlined]
#call#15 at SharedArrays.jl:171 [inlined]
SharedArray{Int8,N} where N(::Int64, ::Int64, ::Int64) at SharedArrays.jl:171
#simulatedMoments#67(::Float64, ::Int64, ::typeof(simulatedMoments), ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::Int64, ::Int64, ::Int64, ::Int64, ::Array{Any,3}, ::Array{Float64,1}, ::Int64, ::Int64, ::Int64, ::Float64) at 13-Estimation_Stable.jl:1255
simulatedMoments at 13-Estimation_Stable.jl:1232 [inlined]
#gmm_fun#70(::Float64, ::Float64, ::Int64, ::typeof(gmm_fun), ::SharedArray{Int64,3}, ::Array{Any,3}, ::Array{Float64,1}, ::Array{Float64,1}, ::Int64, ::Int64, ::Int64, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Array,1}, ::Array{Array,1}, ::Array{Array,1}, ::Array{Array,1}, ::Array{Float64,2}) at 13-Estimation_Stable.jl:1297
gmm_fun(::SharedArray{Int64,3}, ::Array{Any,3}, ::Array{Float64,1}, ::Array{Float64,1}, ::Int64, ::Int64, ::Int64, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Float64,4}, ::Array{Array,1}, ::Array{Array,1}, ::Array{Array,1}, ::Array{Array,1}, ::Array{Float64,2}) at 13-Estimation_Stable.jl:1284
obj_function_final(::Array{Float64,1}, ::Array{Any,1}) at 13-Estimation_Stable.jl:1370
top-level scope at util.jl:156
I am adding an MWE that reproduces the error, which I hope makes things more transparent:

using Distributed, SharedArrays

rmprocs()
addprocs()
big_array = rand(100,11,20000)

function donothing(a)
   shared_array = convert(SharedArray,a)
end

for i=1:1000
    donothing(big_array)
end


The error is as follows:

SystemError: mmap: The operation completed successfully. 
#windowserror#45(::Nothing, ::typeof(Base.windowserror), ::Symbol, ::Bool) at error.jl:148
windowserror at error.jl:148 [inlined]
#mmap#1(::Bool, ::Bool, ::typeof(Mmap.mmap), ::Mmap.Anonymous, ::Type{Array{Float64,3}}, ::Tuple{Int64,Int64,Int64}, ::Int64) at Mmap.jl:218
mmap(::Mmap.Anonymous, ::Type{Array{Float64,3}}, ::Tuple{Int64,Int64,Int64}, ::Int64) at Mmap.jl:186
_shm_mmap_array(::Type, ::Tuple{Int64,Int64,Int64}, ::String, ::UInt16) at SharedArrays.jl:670
shm_mmap_array(::Type, ::Tuple{Int64,Int64,Int64}, ::String, ::UInt16) at SharedArrays.jl:649
#call#3(::Bool, ::Array{Int64,1}, ::Type{SharedArray{Float64,3}}, ::Tuple{Int64,Int64,Int64}) at SharedArrays.jl:118
Type at SharedArrays.jl:105 [inlined]
Type at SharedArrays.jl:357 [inlined]
convert at SharedArrays.jl:369 [inlined]
donothing(::Array{Float64,3}) at mwproblem.jl:8
top-level scope at mwproblem.jl:12

Here is an MWE pattern for a scenario that should be similar to yours:

using Distributed
addprocs(4)
using SharedArrays


@everywhere function compute()
    rand() + myid()*100
end

function dothejob()
    s = SharedArray(zeros(10000000))
    @sync @distributed for i in 1:10000000
        s[i] = compute()
    end
    s
end

myres = dothejob();
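
When a function like dothejob is called repeatedly from an iterative procedure, each call allocates a fresh shared-memory segment. A minimal sketch of an alternative, assuming the result of one iteration can be consumed before the next one starts, is to allocate the SharedArray once and fill it in place. The name dothejob!, the worker count, and the buffer size below are hypothetical choices for illustration.

```julia
using Distributed
nprocs() == 1 && addprocs(2)   # assumption: two local workers are enough for a demo
@everywhere using SharedArrays

# Worker-side computation, same shape as compute() in the answer.
@everywhere function compute()
    rand() + myid() * 100
end

# Hypothetical in-place variant of dothejob: fills a caller-supplied buffer
# instead of allocating a new SharedArray on every call.
function dothejob!(s::SharedArray)
    @sync @distributed for i in 1:length(s)
        s[i] = compute()
    end
    return s
end

buffer = SharedArray{Float64}(100_000)   # allocated once, outside the loop
for iter in 1:5
    dothejob!(buffer)
    # use the contents of buffer here; the next iteration overwrites them
end
```

Because only one shared segment exists for the whole loop, the number of mmap calls no longer grows with the number of iterations.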

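Your function distributes the workload across all workers. It looks like a function that would normally be called from the master node. Why is it prefixed with @everywhere? Are you calling it from the workers?

Hi! Thank you very much for your reply. I am a beginner in Julia and I am not quite sure where I should write @everywhere and where I should not. In any case, the function should be called from the master node. Do you think that could solve the problem?

@everywhere defines a function on the master and on all worker nodes. For most scenarios (most likely yours) you want the master to orchestrate the parallel jobs on the workers. Try starting from the code in my answer.

In case it is useful, I have posted the full error I get.

Try a file-backed SharedArray. If that solves the problem, it means it is a memory issue.

Just to be sure, shouldn't I call finalize(s) at some point?

I am afraid it still does not work. I made the correction you suggested, and my code now looks just like the MWE in your answer. The second time I run the code, I get the error SystemError: mmap: The operation completed successfully. @Przemyslaw

In the MWE it does not happen. Maybe you are running out of memory? Try monitoring the usage. Consider reusing the s object. If it is not a memory issue, you will probably need to bisect between the non-working code and the working code to find the line that causes the problem.

I believe, as you pointed out, that the problem must be memory. What is strange, however, is that it runs the first time and not the second. I "tried" to create the SharedArray in a local scope so that it would be removed at the end of each loop iteration, but that does not seem to happen.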
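
Going out of scope does not by itself unmap a SharedArray; the mapping is released only once the object is finalized and garbage collected on every process that holds it. A minimal sketch of releasing it explicitly, assuming the failure really is caused by shared segments piling up between iterations as suspected above, could look like this. The helper name sum_via_shared, the array size, and the iteration count are hypothetical, and sum stands in for the real parallel work.

```julia
using Distributed
nprocs() == 1 && addprocs(2)   # assumption: two local workers are enough for a demo
@everywhere using SharedArrays

# Hypothetical helper: copy the data into shared memory, use it, then release it.
function sum_via_shared(a::Array)
    s = SharedArray(a)     # allocates a new shared-memory segment
    total = sum(s)         # stand-in for the real parallel work
    finalize(s)            # run the SharedArray finalizer now; do not touch s afterwards
    return total
end

big_array = rand(100, 11, 200)   # deliberately smaller than in the question
for i in 1:20
    sum_via_shared(big_array)
    @everywhere GC.gc()    # ask every process to collect the now-unreferenced mappings
end
```

Whether this removes the error on a given Windows setup is not guaranteed; if memory use still grows while monitoring it, reusing a single SharedArray across iterations is the more conservative option.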