Tracking progress of joblib.Parallel execution

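Is there a simple way to track the overall progress of a joblib.Parallel execution?

I have a long-running execution composed of thousands of jobs, which I want to track and record in a database. To do that, whenever Parallel finishes a task, I need it to execute a callback that reports how many jobs remain.

I have accomplished a similar task before with Python's stdlib multiprocessing.Pool, by launching a thread that records the number of pending jobs in the Pool's job list.

Looking at the code, Parallel inherits from Pool, so I thought I could pull off the same trick, but it does not seem to use those lists, and I have not found any other way to "read" its internal state.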

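The documentation you linked to states that Parallel has an optional progress meter. It is implemented by using the callback keyword argument provided by multiprocessing.Pool.apply_async: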

# This is inside a dispatch function
self._lock.acquire()
job = self._pool.apply_async(SafeFunction(func), args,
            kwargs, callback=CallBack(self.n_dispatched, self))
self._jobs.append(job)
self.n_dispatched += 1
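To make the mechanism concrete, here is a minimal sketch of how the callback keyword of apply_async can count finished jobs. It uses only the standard library, with multiprocessing.pool.ThreadPool standing in for joblib's internal pool; CompletionCounter, square, and run_with_counter are hypothetical names and are not part of joblib.

```python
import threading
from multiprocessing.pool import ThreadPool


def square(x):
    return x * x


class CompletionCounter:
    """Callable passed as callback=...; the pool calls it with each job's result."""

    def __init__(self, total):
        self.total = total
        self.completed = 0
        self._lock = threading.Lock()

    def __call__(self, result):
        # The pool invokes callbacks from its result-handling thread, so the
        # lock protects the counter if another thread reads it concurrently.
        with self._lock:
            self.completed += 1

    @property
    def remaining(self):
        with self._lock:
            return self.total - self.completed


def run_with_counter(n_jobs=10, workers=2):
    counter = CompletionCounter(total=n_jobs)
    with ThreadPool(workers) as pool:
        jobs = [pool.apply_async(square, (i,), callback=counter)
                for i in range(n_jobs)]
        # get() returns only after the job's callback has run.
        results = [job.get() for job in jobs]
    return results, counter
```

Because the counter is incremented on completion rather than on dispatch, counter.remaining is the number a monitoring thread would want to read.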

And here is print_progress:

def print_progress(self, index):
    elapsed_time = time.time() - self._start_time

    # This is heuristic code to print only 'verbose' times a messages
    # The challenge is that we may not know the queue length
    if self._original_iterable:
        if _verbosity_filter(index, self.verbose):
            return
        self._print('Done %3i jobs       | elapsed: %s',
                    (index + 1,
                     short_format_time(elapsed_time),
                    ))
    else:
        # We are finished dispatching
        queue_length = self.n_dispatched
        # We always display the first loop
        if not index == 0:
            # Display depending on the number of remaining items
            # A message as soon as we finish dispatching, cursor is 0
            cursor = (queue_length - index + 1
                      - self._pre_dispatch_amount)
            frequency = (queue_length // self.verbose) + 1
            is_last_item = (index + 1 == queue_length)
            if (is_last_item or cursor % frequency):
                return
        remaining_time = (elapsed_time / (index + 1) *
                    (self.n_dispatched - index - 1.))
        self._print('Done %3i out of %3i | elapsed: %s remaining: %s',
                    (index + 1,
                     queue_length,
                     short_format_time(elapsed_time),
                     short_format_time(remaining_time),
                    ))
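The remaining-time figure in the excerpt above is a plain linear extrapolation. As a sketch, the same arithmetic can be isolated in a small function; estimate_remaining is a hypothetical name used only for illustration.

```python
def estimate_remaining(elapsed_time, index, n_dispatched):
    """Average time per finished job multiplied by the jobs still outstanding.

    Mirrors the expression in print_progress:
    elapsed_time / (index + 1) * (n_dispatched - index - 1)
    """
    done = index + 1
    return elapsed_time / done * (n_dispatched - done)
```

For example, 10 seconds elapsed with index 4 out of 20 dispatched jobs gives 2 seconds per job times 15 outstanding jobs, that is 30 seconds. The estimate is only as good as the assumption that index reflects how many jobs have actually finished.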

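To be honest, the way they have implemented this is a bit odd: it seems to assume that tasks always complete in the order they were started. The index variable passed to print_progress is just the value of self.n_dispatched at the time the job was actually started. So the first job launched will always finish with an index of 0, even if, say, the third job finished first. It also means they do not actually keep track of the number of completed jobs, so there is no instance variable for you to monitor.

I think your best bet is to write your own CallBack class and monkey patch Parallel: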
from math import sqrt
from collections import defaultdict
from joblib import Parallel, delayed

class CallBack(object):
    completed = defaultdict(int)

    def __init__(self, index, parallel):
        self.index = index
        self.parallel = parallel

    def __call__(self, index):
        CallBack.completed[self.parallel] += 1
        print("done with {}".format(CallBack.completed[self.parallel]))
        if self.parallel._original_iterable:
            self.parallel.dispatch_next()

import joblib.parallel
joblib.parallel.CallBack = CallBack

if __name__ == "__main__":
    print(Parallel(n_jobs=2)(delayed(sqrt)(i**2) for i in range(10)))

Output:
done with 1
done with 2
done with 3
done with 4
done with 5
done with 6
done with 7
done with 8
done with 9
done with 10
[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

That way, your callback gets called whenever a job completes, instead of the default one.
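Since the goal in the question is to record the remaining count in a database, the callback body is where that write would go. The following is only a sketch under stated assumptions: it uses the standard library's ThreadPool in place of joblib, an in-memory SQLite database, and a hypothetical progress table; DbProgressLogger, cube, and run_logged are illustrative names.

```python
import sqlite3
import threading
from multiprocessing.pool import ThreadPool


class DbProgressLogger:
    """Completion callback that appends (completed, remaining) rows to a table."""

    def __init__(self, conn, total):
        self.conn = conn
        self.total = total
        self.completed = 0
        self._lock = threading.Lock()
        conn.execute(
            "CREATE TABLE IF NOT EXISTS progress "
            "(completed INTEGER, remaining INTEGER)"
        )

    def __call__(self, _result):
        # Serialize counter updates and database writes.
        with self._lock:
            self.completed += 1
            self.conn.execute(
                "INSERT INTO progress VALUES (?, ?)",
                (self.completed, self.total - self.completed),
            )
            self.conn.commit()


def cube(x):
    return x ** 3


def run_logged(n_jobs=5):
    # check_same_thread=False because the callback runs in a pool thread.
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    logger = DbProgressLogger(conn, total=n_jobs)
    with ThreadPool(2) as pool:
        jobs = [pool.apply_async(cube, (i,), callback=logger)
                for i in range(n_jobs)]
        results = [job.get() for job in jobs]
    rows = conn.execute(
        "SELECT completed, remaining FROM progress ORDER BY completed"
    ).fetchall()
    conn.close()
    return results, rows
```

The same __call__ body could be dropped into the patched CallBack class above, with the database connection supplied however suits the application.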
Here is another answer to your question, with the following syntax:
aprun = ParallelExecutor(n_jobs=5)

a1 = aprun(total=25)(delayed(func)(i ** 2 + j) for i in range(5) for j in range(5))
a2 = aprun(total=16)(delayed(func)(i ** 2 + j) for i in range(4) for j in range(4))
a2 = aprun(bar='txt')(delayed(func)(i ** 2 + j) for i in range(4) for j in range(4))
a2 = aprun(bar=None)(delayed(func)(i ** 2 + j) for i in range(4) for j in range(4))

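Expanding on dano's answer for the newest version of the joblib library. There were a couple of changes to the internal implementation.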
from joblib import Parallel, delayed
from collections import defaultdict

# patch joblib progress callback
class BatchCompletionCallBack(object):
  completed = defaultdict(int)

  def __init__(self, time, index, parallel):
    self.index = index
    self.parallel = parallel

  def __call__(self, index):
    BatchCompletionCallBack.completed[self.parallel] += 1
    print("done with {}".format(BatchCompletionCallBack.completed[self.parallel]))
    if self.parallel._original_iterator is not None:
      self.parallel.dispatch_next()

import joblib.parallel
joblib.parallel.BatchCompletionCallBack = BatchCompletionCallBack

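Text progress bar
One more variant for those who want a text progress bar without additional modules such as tqdm. Current as of 16 April 2018 for joblib 0.11 and Python 3.5.2 on Linux; it shows progress as subtasks complete.
Redefine the native class: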
import time  # needed for time.time() in __call__

class BatchCompletionCallBack(object):
    # Added code - start
    global total_n_jobs
    # Added code - end
    def __init__(self, dispatch_timestamp, batch_size, parallel):
        self.dispatch_timestamp = dispatch_timestamp
        self.batch_size = batch_size
        self.parallel = parallel

    def __call__(self, out):
        self.parallel.n_completed_tasks += self.batch_size
        this_batch_duration = time.time() - self.dispatch_timestamp

        self.parallel._backend.batch_completed(self.batch_size,
                                           this_batch_duration)
        self.parallel.print_progress()
        # Added code - start
        progress = self.parallel.n_completed_tasks / total_n_jobs
        print(
            "\rProgress: [{0:50s}] {1:.1f}%".format('#' * int(progress * 50), progress*100)
            , end="", flush=True)
        if self.parallel.n_completed_tasks == total_n_jobs:
            print('\n')
        # Added code - end
        if self.parallel._original_iterator is not None:
            self.parallel.dispatch_next()

import joblib.parallel
joblib.parallel.BatchCompletionCallBack = BatchCompletionCallBack

Define a global constant with the total number of jobs before use:
total_n_jobs = 10

This results in something like:
Progress: [########################################          ] 80.0%
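The bar formatting can be pulled out into a small pure function, which makes it easy to check without running any jobs. This is a sketch, not the answer's exact code: render_bar is a hypothetical name, it assumes total is greater than zero, and it clamps the count so the bar never exceeds 100%.

```python
def render_bar(completed, total, width=50):
    """Return a text bar such as 'Progress: [####      ] 40.0%'."""
    completed = min(completed, total)      # never draw past 100%
    filled = width * completed // total    # integer math avoids float rounding
    percent = 100.0 * completed / total
    return "Progress: [{0}{1}] {2:.1f}%".format(
        "#" * filled, " " * (width - filled), percent
    )
```

It could then be printed in place with print("\r" + render_bar(done, total), end="", flush=True).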

Why can't you simply use tqdm? The following worked for me:
from joblib import Parallel, delayed
from datetime import datetime
from tqdm import tqdm

def myfun(x):
    return x**2

results = Parallel(n_jobs=8)(delayed(myfun)(i) for i in tqdm(range(1000)))
100%|██████████| 1000/1000 [00:00<00:00, 10563.37it/s]

In Jupyter, tqdm starts a new line in the output each time it updates.
So for Jupyter Notebook it would be as follows (without sleep):
from joblib import Parallel, delayed
from datetime import datetime
from tqdm import notebook

def myfun(x):
    return x**2

results = Parallel(n_jobs=8)(delayed(myfun)(i) for i in notebook.tqdm(range(1000)))  

100% 1000/1000 [00:06

Another step beyond dano's and Connor's answers is to wrap the whole thing in a context manager:
import contextlib
import joblib
from tqdm import tqdm    
from joblib import Parallel, delayed

@contextlib.contextmanager
def tqdm_joblib(tqdm_object):
    """Context manager to patch joblib to report into tqdm progress bar given as argument"""
    class TqdmBatchCompletionCallback(joblib.parallel.BatchCompletionCallBack):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)

        def __call__(self, *args, **kwargs):
            tqdm_object.update(n=self.batch_size)
            return super().__call__(*args, **kwargs)

    old_batch_callback = joblib.parallel.BatchCompletionCallBack
    joblib.parallel.BatchCompletionCallBack = TqdmBatchCompletionCallback
    try:
        yield tqdm_object
    finally:
        joblib.parallel.BatchCompletionCallBack = old_batch_callback
        tqdm_object.close()    

Then you can use it like this, without leaving monkey-patched code behind once you are done:
from math import sqrt

with tqdm_joblib(tqdm(desc="My calculation", total=10)) as progress_bar:
    Parallel(n_jobs=16)(delayed(sqrt)(i**2) for i in range(10))
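The reason the patch does not leak is the try/finally inside the context manager. Here is a self-contained sketch of that pattern using a stand-in object instead of joblib; fake_lib, default_report, patched, and demo are hypothetical names for illustration only.

```python
import contextlib
import types


def default_report(n):
    return "default:{}".format(n)


# Stand-in for a third-party module whose attribute we patch temporarily.
fake_lib = types.SimpleNamespace(report=default_report)


@contextlib.contextmanager
def patched(owner, name, replacement):
    """Swap owner.<name> for replacement and always restore the original."""
    original = getattr(owner, name)
    setattr(owner, name, replacement)
    try:
        yield replacement
    finally:
        # Runs on normal exit and when the with-body raises.
        setattr(owner, name, original)


def demo():
    seen = []
    try:
        with patched(fake_lib, "report", lambda n: "patched:{}".format(n)):
            seen.append(fake_lib.report(1))
            raise RuntimeError("simulated failure inside the block")
    except RuntimeError:
        pass
    seen.append(fake_lib.report(2))  # original is back despite the error
    return seen
```

Calling demo() shows the patched function in use inside the block and the original restored afterwards, even though the block raised.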

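I think this is awesome, and it looks similar to tqdm's pandas integration.
TLDR solution:
Works with joblib 0.14.0 and tqdm 4.46.0 using Python 3.5. Credit goes to frenzykryger for the contextlib suggestion, and to dano and Connor for the monkey-patching idea.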
import contextlib
import joblib
from tqdm import tqdm
from joblib import Parallel, delayed

@contextlib.contextmanager
def tqdm_joblib(tqdm_object):
    """Context manager to patch joblib to report into tqdm progress bar given as argument"""

    def tqdm_print_progress(self):
        if self.n_completed_tasks > tqdm_object.n:
            n_completed = self.n_completed_tasks - tqdm_object.n
            tqdm_object.update(n=n_completed)

    original_print_progress = joblib.parallel.Parallel.print_progress
    joblib.parallel.Parallel.print_progress = tqdm_print_progress

    try:
        yield tqdm_object
    finally:
        joblib.parallel.Parallel.print_progress = original_print_progress
        tqdm_object.close()

You can use this the same way as described by frenzykryger:
import time
def some_method(wait_time):
    time.sleep(wait_time)

with tqdm_joblib(tqdm(desc="My method", total=10)) as progress_bar:
    Parallel(n_jobs=2)(delayed(some_method)(0.2) for i in range(10))

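Detailed explanation:
Jon's solution is simple to implement, but it only measures dispatched tasks. If a task takes a long time, the bar will sit at 100% while waiting for the last dispatched task to finish executing.
The context manager approach by frenzykryger, improved from dano and Connor, is better, but BatchCompletionCallBack can also be called with ImmediateResult before the task completes. This gives a count that exceeds 100%.
Instead of monkey patching BatchCompletionCallBack, we can patch the print_progress function in Parallel. BatchCompletionCallBack already calls print_progress anyway. If verbose is set (i.e. Parallel(n_jobs=2, verbose=100)), print_progress prints the completed tasks, though not as nicely as tqdm. Looking at the code, print_progress is a method of the Parallel class, so it already has access to self.n_completed_tasks, which records the number we want. All we have to do is compare that number with the progress bar's current count (tqdm_object.n) and update only if there is a difference.
This was tested with joblib 0.14.0 and tqdm 4.46.0 using Python 3.5.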
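The "update only if there is a difference" step is what keeps the bar from overshooting. A minimal sketch of that logic follows, with a stand-in class exposing only the two tqdm members the patch relies on (the n attribute and update()); CountingBar and sync_bar are hypothetical names.

```python
class CountingBar:
    """Stand-in with the two members used by the patch: n and update()."""

    def __init__(self, total):
        self.total = total
        self.n = 0

    def update(self, n=1):
        self.n += n


def sync_bar(bar, n_completed_tasks):
    """Advance the bar to n_completed_tasks; never move it backwards."""
    if n_completed_tasks > bar.n:
        bar.update(n=n_completed_tasks - bar.n)
    return bar.n
```

Because the bar is moved to an absolute count rather than incremented per callback, repeated or early calls do not inflate the total.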
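Well researched, thanks. I had not noticed the callback attribute.
I find joblib's documentation very limited. I had to dig into the source code for this CallBack class. My question: can I customize the arguments when __call__ is called? (Subclassing the whole Parallel class may be one way, but that is heavy for me.)
Very neat. Thanks.
I think this is not actually monitoring the completion of running jobs, just the queueing of jobs. If you insert a time.sleep(1) at the start of myfun, you will find that the tqdm progress finishes almost instantly, but the results take a few more seconds to populate.
Yes, that is partly correct. It tracks job starts rather than completions. Another issue is that overhead also causes a delay after all jobs have completed: once all tasks are done, the results need to be collected, and that can take quite some time.
I believe this answer does not really answer the question. As mentioned, this approach tracks the queueing, not the execution itself. The callback approach shown below seems more precise.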
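The difference between counting dispatched items and counting completed ones can be shown with the standard library alone. This sketch uses concurrent.futures rather than joblib, so it exaggerates the effect (submit consumes the whole input at once, whereas joblib pre-dispatches in chunks); slow_square and compare_counts are hypothetical names.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def slow_square(x):
    time.sleep(0.05)
    return x * x


def compare_counts(n_tasks=6, workers=2):
    consumed = 0

    def counting(iterable):
        # Plays the role of tqdm wrapped around the input iterable.
        nonlocal consumed
        for item in iterable:
            consumed += 1
            yield item

    with ThreadPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(slow_square, i)
                   for i in counting(range(n_tasks))]
        consumed_after_submit = consumed
        finished_after_submit = sum(f.done() for f in futures)

        # Counting here tracks real completions instead.
        completed = 0
        for _ in as_completed(futures):
            completed += 1

    return consumed_after_submit, finished_after_submit, completed
```

Right after submission the input-side counter already reads n_tasks while few or no tasks have finished, which is the behaviour described in the comments above.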