If I use multiprocessing in Python but my function returns nothing, do I need to call get()?

I want to use Python's multiprocessing module to parallelize this simple example:

import numpy as np
import h5py
import os
import matplotlib.pyplot as plt
from multiprocessing import Pool

def load_array(path, variable):
    try:
        return np.array(h5py.File(path, "r").get(variable))
    except:
        raise FileNotFoundError("Corrupted file: {}".format(path))

def mat2img(rootdir, save_path, variable):

    fig = plt.figure()

    print("Processing " + rootdir)

    for subdir, dirs, files in os.walk(rootdir):
        for file in files:
            arr = load_array(os.path.join(subdir, file), variable).T

            fig.subplots_adjust(top=1, bottom=0, right=1, left=0)
            plt.pcolormesh(np.arange(0, arr.shape[1]), np.arange(0, arr.shape[0]), arr, cmap="jet")
            plt.axis("off")
            plt.savefig(os.path.join(save_path, subdir.split(os.path.sep)[-1], file + ".jpg"))
            plt.clf()

if __name__ == '__main__':

    with Pool(processes=3) as pool:
        pool.apply_async(mat2img, ("O:\\data1", "O:\\spectrograms", "spectrum"))
        pool.apply_async(mat2img, ("O:\\data2", "O:\\spectrograms", "spectrum"))
        pool.apply_async(mat2img, ("O:\\data3", "O:\\spectrograms", "spectrum"))

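However, this does nothing: apply_async never seems to call the function. From the documentation, I see that the result of each apply_async call is assigned to some variable res. Do I need to do that even if my function returns nothing? If so, what does the variable res contain, and what would calling get() return? Where is my mistake?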

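You schedule the jobs with apply_async. Then you have to wait until they finish. If you do not wait, leaving the with-statement terminates the pool, so the jobs may never even start.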

with Pool(processes=3) as pool:
    pool.apply_async(mat2img, ("O:\\data1", "O:\\spectrograms", "spectrum"))
    pool.apply_async(mat2img, ("O:\\data2", "O:\\spectrograms", "spectrum"))
    pool.apply_async(mat2img, ("O:\\data3", "O:\\spectrograms", "spectrum"))
    pool.close()  # Do not accept any more jobs.
    pool.join()  # Wait until all async jobs complete (join() takes no timeout argument).
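
As a small illustration of the close()/join() approach, the sketch below queues jobs without keeping any result object and still sees every job run. It is only a sketch under assumptions: it uses multiprocessing.pool.ThreadPool, which exposes the same apply_async/close/join interface as Pool, so that the jobs can record their work in a shared list. The names record_job and run_all, and the "data1"... labels, are hypothetical stand-ins for mat2img and its arguments.

```python
import threading
from multiprocessing.pool import ThreadPool


def record_job(name, done, lock):
    """Hypothetical stand-in for mat2img: does some work and returns nothing."""
    with lock:
        done.append(name)


def run_all(names):
    """Queue one job per name, discard the result objects, then wait for all of them."""
    done = []
    lock = threading.Lock()
    pool = ThreadPool(processes=3)
    for name in names:
        # The AsyncResult returned here is deliberately not stored.
        pool.apply_async(record_job, (name, done, lock))
    pool.close()  # No more jobs will be submitted.
    pool.join()   # Block until the workers have drained the queue.
    return sorted(done)


if __name__ == "__main__":
    print(run_all(["data1", "data2", "data3"]))
```

With a process-based Pool the same close()/join() sequence applies, but the jobs would have to report their work through files or return values rather than a shared list.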

Alternatively, you can call .get() on each job to make sure it completes:

with Pool(processes=3) as pool:
    # Schedule the jobs.
    jobs = [pool.apply_async(mat2img, (dest, "O:\\spectrograms", "spectrum"))
            for dest in ("O:\\data1", "O:\\data2", "O:\\data3")]
    # Wait for the jobs to complete.
    for job in jobs:
        job.get(timeout=100)

As @AndreaCorbellini correctly pointed out, if your jobs do not return any result you care about, you can call job.wait() instead of job.get().
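
To make the difference concrete, here is a hedged sketch of what the result object holds when the function returns nothing. apply_async returns an AsyncResult. Its get() returns the job's return value (None here) and re-raises any exception the job raised, whereas wait() only blocks and leaves it to you to check successful(). The sketch uses multiprocessing.pool.ThreadPool, which returns the same AsyncResult objects as Pool, so that it runs without a __main__ guard. The functions returns_nothing, always_fails and inspect_results are hypothetical.

```python
from multiprocessing.pool import ThreadPool


def returns_nothing(path):
    """Hypothetical job with no return value, like mat2img."""
    _ = len(path)


def always_fails(path):
    """Hypothetical job that raises, like load_array on a corrupted file."""
    raise FileNotFoundError("Corrupted file: {}".format(path))


def inspect_results():
    with ThreadPool(processes=2) as pool:
        ok = pool.apply_async(returns_nothing, ("data1",))
        bad = pool.apply_async(always_fails, ("data2",))

        # get() hands back whatever the function returned: None in this case.
        value = ok.get(timeout=10)

        # wait() blocks until the job is done but does not raise on failure.
        bad.wait(timeout=10)
        failed = not bad.successful()

        # get() re-raises the exception that occurred inside the job.
        try:
            bad.get(timeout=10)
            reraised = None
        except FileNotFoundError as exc:
            reraised = str(exc)
    return value, failed, reraised


if __name__ == "__main__":
    print(inspect_results())
```

One practical consequence: if you only call wait(), a failing job such as a corrupted file can go unnoticed unless you also check successful(), so get() is the safer default even when the return value is None.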

So I have to call get regardless of whether my function returns a result. Thanks!

@Colonder: Yes, .get() means "I actually want to run this task". Sometimes you don't, for example when you want to stop the computation early and not even start some of the queued tasks.

@9000: As you stated, calling .join() without ... works, regardless of whether you call .get() or not.

@AndreaCorbellini: I think .join() calls .get() and just ignores the result.

@9000: Sort of, but what I wanted to clarify is ...