Distributed parallel map (pmap) with a timeout/time limit for each map task to complete

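My project involves computing a map in parallel using Julia's `pmap` function.

Mapping a given element could take a few seconds, or it could take a very long time. I need a timeout or time limit for an individual map task/computation to complete.

If a map task finishes in time, great: return the result of the computation. If the task does not finish within the time limit, stop the computation once the limit is reached and return some value or message indicating that a timeout occurred.

A minimal example is below. First the modules are imported, and then the worker processes are launched: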

num_procs = 1
using Distributed
if num_procs > 1
    # The main process (no calling addprocs) can be used for `pmap`:
    addprocs(num_procs-1)
end
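
As a small illustration of the comment in the snippet above, here is a sketch showing that `pmap` still runs when no workers have been added, by doing the work on the main process. It assumes a fresh Julia session started without the `-p` flag; the squaring function is only a placeholder.

```julia
using Distributed

# In a fresh session with no `addprocs` call, only the main process exists.
n = nprocs()      # number of processes, including the main one
w = workers()     # ids of the processes available for work

# With no extra workers, `pmap` runs the tasks on the main process.
squares = pmap(x -> x^2, 1:4)
println((n, w, squares))
```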

Next, the mapping task is defined on all worker processes. The mapping task should time out after 1 second:

@everywhere import Random
@everywhere begin
    """
    Compute stuff for `wait_time` seconds, and return `wait_time`.
    If `timeout` seconds elapses, stop computation and return something else.
    """
    function waitForTimeUnlessTimeout(wait_time, timeout=1)

        # < Insert some sort of timeout code? >

        # This block of code simulates a long computation.
        # (pretend the computation time is unknown)
        t0 = time()  # start time of the simulated computation
        x = 0
        while time()-t0 < wait_time
            x += Random.rand() - 0.5
        end

        # computation completed before time limit. Return wait_time.
        round(wait_time, digits=2)
    end
end

How should this time-limited parallel map be implemented?

You can put something like this in the body of `pmap`:

pmap(runtimes) do runtime
  t0 = time()
  task = @async waitForTimeUnlessTimeout(runtime)
  while !istaskdone(task) && time()-t0 < time_limit
      sleep(1)
  end
  istaskdone(task) && (return fetch(task))
  error("time over")
end
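
The snippet above signals a timeout by throwing an error, which by default aborts the whole `pmap` call. As a minimal sketch of one way around that, `pmap` accepts an `on_error` keyword whose return value is placed inline with the results. The function `maybe_fail` below is a hypothetical stand-in for a task that times out, not part of the original code.

```julia
using Distributed

# Hypothetical stand-in: inputs greater than 2 "time out" by throwing.
function maybe_fail(x)
    x > 2 && error("time over")
    return x
end

# `on_error` receives the exception; whatever it returns replaces the
# failed element in the result, so the remaining tasks still complete.
results = pmap(maybe_fail, 1:4; on_error = e -> "TimeOut")
println(results)
```

When worker processes have been added, `maybe_fail` would need to be defined with `@everywhere` so that every worker can call it.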


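Also note that `(runtime) -> waitForTimeUnlessTimeout(runtime)` is the same as `waitForTimeUnlessTimeout`.

Following @Fredrik Bagge's very helpful answer, here is a complete working example implementation, with some extra explanation: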

num_procs = 8
using Distributed
if num_procs > 1
    addprocs(num_procs-1)
end

@everywhere import Random
@everywhere begin
    function waitForTime(wait_time)
         # This code block simulates a long computation.
         # Pretend the computation time is unknown.
        t0 = time()
        x = 0
        while time()-t0 < wait_time
            x += Random.rand() - 0.5
            yield() # CRITICAL to release computation to check if task is done.
            # If you comment out #yield(), you will see timeout doesn't work!
        end

        return round(wait_time, digits=2)
    end
end

function myParallelMapping(num_tasks = 16, max_runtime=2, time_limit=1)
    # random task runtimes between 0 and max_runtime
    runtimes = Random.rand(num_tasks) * max_runtime

    # parallel compute the mapping tasks. See "do block" in 
    # the Julia documentation, it's just syntactic sugar.
    return pmap(runtimes) do runtime
                  t0 = time()
                  task = @async waitForTime(runtime)
                  while !istaskdone(task) && time()-t0 < time_limit
                      # releases computation to waitForTime
                      sleep(0.1)
                      # nothing past here will run until waitForTime calls yield()
                      # *and* 0.1 seconds have passed.
                  end
                  # equal to if istaskdone(task); return fetch(task); end
                  istaskdone(task) && (return fetch(task))
                  return "TimeOut"
                  # `return error("TimeOut")` halts pmap unless pmap is
                  #  given an error handler argument. See pmap documentation.
              end
end

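Note that in this example there are two tasks per process. The original task (the "time checker") checks every 0.1 seconds whether the other task has finished computing. The other task (created with `@async`) does the computation and periodically calls `yield()` to hand control back to the time checker; if it never calls `yield()`, the time check cannot run.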
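
The interplay between the two tasks can be tried out on a single process. The sketch below wraps the polling loop in a hypothetical helper, `poll_with_limit` (the name and the `poll_interval` parameter are assumptions, not part of the answer), and runs it against a cooperative task and against a busy loop that never yields. Note that on timeout the helper only stops waiting; it does not kill the background task.

```julia
# Hypothetical helper: run `f` in a task and poll it until `time_limit` seconds pass.
function poll_with_limit(f, time_limit; poll_interval = 0.05)
    t0 = time()
    task = @async f()
    while !istaskdone(task) && time() - t0 < time_limit
        sleep(poll_interval)   # hands control to the scheduler
    end
    return istaskdone(task) ? fetch(task) : "TimeOut"
end

# Cooperative work: `sleep` yields, so the checker wakes up on schedule.
cooperative() = (sleep(1.0); "finished")

# Uncooperative work: a busy loop with no yield points.
function busy_no_yield(duration)
    t0 = time()
    x = 0.0
    while time() - t0 < duration
        x += rand() - 0.5
    end
    return "finished"
end

r1 = poll_with_limit(cooperative, 0.2)   # checker gives up after about 0.2 s

t_start = time()
r2 = poll_with_limit(() -> busy_no_yield(0.5), 0.1)
elapsed = time() - t_start               # the busy loop ran to completion first
println((r1, r2, elapsed))
```

In the second call the checker cannot run again until the busy loop has finished, so the time limit is overrun and the computed value is returned instead of "TimeOut".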

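I got it working. By default, throwing an error stops `pmap` entirely, but I believe `pmap` has an optional error-handler argument, or you can simply return a non-error value on timeout. Thank you very much!!

One thing to keep in mind is that `sleep` actually releases control of the computation to the scheduler. In my original example I used `sleep` inside `waitForTimeUnlessTimeout` to represent a long computation, but I changed the example so that it no longer uses `sleep`. This means `waitForTimeUnlessTimeout` does not release control to the scheduler until it calls `yield()`, just like a real computation. I have updated my answer to use your answer.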

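Running the full example gives output such as the following (the runtimes are random, so the results vary between runs):

julia> print(myParallelMapping())

Any["TimeOut", "TimeOut", 0.33, 0.35, 0.56, 0.41, 0.08, 0.14, 0.72,
    "TimeOut", "TimeOut", "TimeOut", 0.52, "TimeOut", 0.33, "TimeOut"]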