Python: Why is mp.Pool().map() slower than ProcessPoolExecutor().map()?
Tags: python, multiprocessing, pool, concurrent.futures

I have a silly piece of code to illustrate a behavior I ran into at work:

from multiprocessing import Pool
from concurrent.futures import ProcessPoolExecutor
from struct import pack
from time import time


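def packer(integer):
    # Pack the integer as a native C int.
    return pack('i', integer)


if __name__ == '__main__':
    pool1 = Pool()
    pool2 = ProcessPoolExecutor()

    nums = list(range(10**4))

    # Pool.map blocks until every result is available.
    start = time()
    res1 = pool1.map(packer, nums)
    print(f'total mp pool: {time() - start}')

    # ProcessPoolExecutor.map returns a lazy iterator, so this timing covers
    # submitting the tasks, not collecting the results.
    start = time()
    res2 = pool2.map(packer, nums)
    print(f'total futures pool: {time() - start}')

    pool1.close()
    pool1.join()
    pool2.shutdown()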

I get (Python 3.8.1):

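At work, I changed my code from mp.Pool() to concurrent.futures so that I could switch between processes and threads. I then found that exception propagation in concurrent.futures is terrible. After going back to mp.Pool(), I noticed a drop in performance.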
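
One concrete difference in how the two APIs surface exceptions can be sketched as follows. This is a minimal illustration, not necessarily the behavior the author objected to. It uses the thread-backed variants (ThreadPool and ThreadPoolExecutor) to keep the sketch light; the process-backed classes raise at the same points. The names reciprocal and where_does_it_raise are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from multiprocessing.pool import ThreadPool


def reciprocal(x):
    # Raises ZeroDivisionError for x == 0.
    return 1 / x


def where_does_it_raise():
    """Return labels describing where each API raised ZeroDivisionError."""
    events = []

    with ThreadPool(2) as pool:
        try:
            pool.map(reciprocal, [1, 2, 0])  # blocks, then raises here
        except ZeroDivisionError:
            events.append('pool: raised by map() call')

    with ThreadPoolExecutor(max_workers=2) as executor:
        results = executor.map(reciprocal, [1, 2, 0])  # no exception yet
        events.append('executor: map() returned')
        try:
            list(results)  # the exception surfaces while iterating
        except ZeroDivisionError:
            events.append('executor: raised while iterating')

    return events


if __name__ == '__main__':
    for event in where_does_it_raise():
        print(event)
```

With Pool.map the failure appears at the call site, whereas with Executor.map it appears only once the returned iterator is consumed.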

I understand that concurrent.futures.ProcessPoolExecutor is supposed to be a higher-level API, so how can it be faster than mp.Pool()?

I see that ProcessPoolExecutor.map is just:

super().map(partial(_process_chunk, fn),
            _get_chunks(*iterables, chunksize=chunksize),
            timeout=timeout)

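where super is the base Executor class (its map method is reproduced at the end of this post).

This is where I get lost: do mp.Pool and ProcessPoolExecutor go down different rabbit holes? By calling mp.Pool.map manually with the right arguments, is it possible to get the "good stuff" from ProcessPoolExecutor?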
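
To make the ProcessPoolExecutor.map snippet more concrete, here is a simplified sketch of what its two helpers do, modeled on the CPython source. The names get_chunks and process_chunk are stand-ins for the private _get_chunks and _process_chunk, and the loop at the bottom runs serially purely for illustration; in the real executor each chunk is submitted as one task to a worker process.

```python
from itertools import islice
from struct import pack


def get_chunks(*iterables, chunksize):
    """Yield tuples of up to `chunksize` argument tuples (stand-in for _get_chunks)."""
    it = zip(*iterables)
    while True:
        chunk = tuple(islice(it, chunksize))
        if not chunk:
            return
        yield chunk


def process_chunk(fn, chunk):
    """Apply fn to every argument tuple in one chunk (stand-in for _process_chunk)."""
    return [fn(*args) for args in chunk]


def packer(integer):
    return pack('i', integer)


# Serial illustration: each chunk corresponds to one task for a worker.
chunks = list(get_chunks(range(5), chunksize=2))
per_chunk = [process_chunk(packer, chunk) for chunk in chunks]
# Flatten the per-chunk lists back into one sequence of results.
flat = [item for sub in per_chunk for item in sub]
```

With the default chunksize=1, every item becomes its own task. Pool.map, by contrast, picks a chunk size automatically when none is given, which is one reason the two calls in the question are not doing the same amount of bookkeeping.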
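
The timings you show don't support your claim: Pool.map() took less time. By the way, you can use multiprocessing.dummy.Pool to use threads instead of processes.

You're right, I need to restate my experience at work more accurately.
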
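
The suggestion in the comment above can be sketched as follows. multiprocessing.dummy.Pool offers the same map interface as multiprocessing.Pool but is backed by threads. The helper make_pool and its use_threads flag are hypothetical names introduced only for this illustration.

```python
from multiprocessing import Pool
from multiprocessing.dummy import Pool as ThreadPool
from struct import pack


def packer(integer):
    return pack('i', integer)


def make_pool(use_threads, workers=4):
    """Hypothetical helper: return a thread- or process-backed pool with the same API."""
    return ThreadPool(workers) if use_threads else Pool(workers)


if __name__ == '__main__':
    # Flip use_threads to switch backends without touching the calling code.
    with make_pool(use_threads=True) as pool:
        thread_results = pool.map(packer, range(100))
    print(len(thread_results))
```

Because both pool types share one interface, the choice between threads and processes can be reduced to a single flag while keeping Pool.map semantics.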
For reference, the map method of the base Executor class that super().map resolves to:

def map(self, fn, *iterables, timeout=None, chunksize=1):
    """Returns an iterator equivalent to map(fn, iter).
    Args:
        fn: A callable that will take as many arguments as there are
            passed iterables.
        timeout: The maximum number of seconds to wait. If None, then there
            is no limit on the wait time.
        chunksize: The size of the chunks the iterable will be broken into
            before being passed to a child process. This argument is only
            used by ProcessPoolExecutor; it is ignored by
            ThreadPoolExecutor.
    Returns:
        An iterator equivalent to: map(func, *iterables) but the calls may
        be evaluated out-of-order.
    Raises:
        TimeoutError: If the entire result iterator could not be generated
            before the given timeout.
        Exception: If fn(*args) raises for any values.
    """
    if timeout is not None:
        end_time = timeout + time.monotonic()

    fs = [self.submit(fn, *args) for args in zip(*iterables)]

    # Yield must be hidden in closure so that the futures are submitted
    # before the first iterator value is required.
    def result_iterator():
        try:
            # reverse to keep finishing order
            fs.reverse()
            while fs:
                # Careful not to keep a reference to the popped future
                if timeout is None:
                    yield fs.pop().result()
                else:
                    yield fs.pop().result(end_time - time.monotonic())
        finally:
            for future in fs:
                future.cancel()
    return result_iterator()
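
The listing above shows that Executor.map submits its tasks up front and then returns a lazy iterator, so timing the bare call does not include waiting for results. A fairer comparison, sketched below under stated assumptions, consumes the iterator with list() and gives both pools the same explicit chunksize. The helper timed and the value chunksize = 100 are arbitrary choices made for illustration, and no particular timings are implied.

```python
from concurrent.futures import ProcessPoolExecutor
from multiprocessing import Pool
from struct import pack
from time import perf_counter


def packer(integer):
    return pack('i', integer)


def timed(label, fn):
    """Run fn(), print the elapsed wall-clock time, and return fn's result."""
    start = perf_counter()
    result = fn()
    print(f'{label}: {perf_counter() - start:.3f}s')
    return result


if __name__ == '__main__':
    nums = list(range(10**4))
    chunksize = 100  # arbitrary value chosen for illustration

    with Pool() as pool:
        res1 = timed('mp pool',
                     lambda: pool.map(packer, nums, chunksize=chunksize))

    with ProcessPoolExecutor() as executor:
        # list() forces the lazy iterator so the work is included in the timing.
        res2 = timed('futures pool',
                     lambda: list(executor.map(packer, nums, chunksize=chunksize)))

    # Both APIs return results in input order.
    assert res1 == res2
```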