Python: How can I accumulate results from pool.apply_async calls?
I want to make calls to pool.apply_async(func) and accumulate each result as soon as it becomes available, without the calls waiting on one another.
import multiprocessing
import numpy as np

chrNames=['chr1','chr2','chr3']
sims=[1,2,3]

def accumulate_chrBased_simBased_result(chrBased_simBased_result,accumulatedSignalArray,accumulatedCountArray):
    signalArray = chrBased_simBased_result[0]
    countArray = chrBased_simBased_result[1]
    accumulatedSignalArray += signalArray
    accumulatedCountArray += countArray

def func(chrName,simNum):
    print('%s %d' %(chrName,simNum))
    result=[]
    signal_array=np.full((10000,), simNum, dtype=float)
    count_array = np.full((10000,), simNum, dtype=int)
    result.append(signal_array)
    result.append(count_array)
    return result

if __name__ == '__main__':
    accumulatedSignalArray = np.zeros((10000,), dtype=float)
    accumulatedCountArray = np.zeros((10000,), dtype=int)
    numofProcesses = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(numofProcesses)
    for chrName in chrNames:
        for simNum in sims:
            result = pool.apply_async(func, (chrName,simNum,))
            accumulate_chrBased_simBased_result(result.get(),accumulatedSignalArray,accumulatedCountArray)
    pool.close()
    pool.join()
    print(accumulatedSignalArray)
    print(accumulatedCountArray)
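Written this way, each pool.apply_async call waits for the previous one to finish.
Is there a way to get rid of this waiting?

You call result.get() in every iteration. get() blocks until that task has finished, so the main process waits for each call before it submits the next one, and the tasks end up running one at a time.
Below is a working version. All tasks are submitted first, and the prints show that the accumulation happens as soon as each call to func is ready. A random sleep is added to func so that the execution times differ noticeably.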
import multiprocessing
import numpy as np
from time import time, sleep
from random import random

chrNames=['chr1','chr2','chr3']
sims=[1,2,3]

def accumulate_chrBased_simBased_result(chrBased_simBased_result,accumulatedSignalArray,accumulatedCountArray):
    signalArray = chrBased_simBased_result[0]
    countArray = chrBased_simBased_result[1]
    accumulatedSignalArray += signalArray
    accumulatedCountArray += countArray

def func(chrName,simNum):
    result=[]
    # Random delay so that the tasks finish at clearly different times
    sleep(random()*5)
    signal_array=np.full((10000,), simNum, dtype=float)
    count_array = np.full((10000,), simNum, dtype=int)
    result.append(signal_array)
    result.append(count_array)
    print('%s %d' %(chrName,simNum))
    return result

if __name__ == '__main__':
    accumulatedSignalArray = np.zeros((10000,), dtype=float)
    accumulatedCountArray = np.zeros((10000,), dtype=int)
    numofProcesses = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(numofProcesses)
    results = []
    # Submit every task first, without calling get()
    for chrName in chrNames:
        for simNum in sims:
            results.append(pool.apply_async(func, (chrName,simNum,)))
    for i in results:
        print(i)
    # Poll the pending results and accumulate each one once it is ready
    while results:
        for r in results[:]:
            if r.ready():
                print('{} is ready'.format(r))
                accumulate_chrBased_simBased_result(r.get(),accumulatedSignalArray,accumulatedCountArray)
                results.remove(r)
    pool.close()
    pool.join()
    print(accumulatedSignalArray)
    print(accumulatedCountArray)
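As an alternative to polling ready() in a loop, Pool.imap_unordered yields each result as soon as a worker has finished it, so the accumulation can be written as a plain for loop. What follows is a minimal sketch, not a drop-in replacement: func_from_task, accumulate_unordered and ARRAY_LEN are hypothetical names introduced here, and the worker takes a single (chrName, simNum) tuple because imap_unordered passes exactly one argument per task.

```python
import itertools
import multiprocessing

import numpy as np

ARRAY_LEN = 10000  # assumed: same array length as in the code above


def func_from_task(task):
    # Hypothetical wrapper: imap_unordered passes one argument per task,
    # so the (chrName, simNum) pair is unpacked here.
    chrName, simNum = task
    signal_array = np.full((ARRAY_LEN,), simNum, dtype=float)
    count_array = np.full((ARRAY_LEN,), simNum, dtype=int)
    return signal_array, count_array


def accumulate_unordered(chrNames, sims, processes=None):
    accumulatedSignalArray = np.zeros((ARRAY_LEN,), dtype=float)
    accumulatedCountArray = np.zeros((ARRAY_LEN,), dtype=int)
    # Every (chrName, simNum) combination, produced lazily
    tasks = itertools.product(chrNames, sims)
    with multiprocessing.Pool(processes) as pool:
        # Results arrive in completion order, not submission order
        for signal_array, count_array in pool.imap_unordered(func_from_task, tasks):
            accumulatedSignalArray += signal_array
            accumulatedCountArray += count_array
    return accumulatedSignalArray, accumulatedCountArray


if __name__ == '__main__':
    signal, count = accumulate_unordered(['chr1', 'chr2', 'chr3'], [1, 2, 3])
    print(signal)
    print(count)
```

Because addition is order-independent, the unordered delivery does not change the totals: with three chromosome names and sims of 1, 2 and 3, every element ends up as 18.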
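Thank you very much. My only concern is the memory used by results. When there are many chrNames and sims, results will take up a lot of space, right? Is there a way to accumulate the results of the pool.apply_async calls as they become available, without collecting them in a list-like structure? That might reduce the memory footprint.
I don't think you will run into that problem, because apply_async actually returns an instance of class multiprocessing.pool.AsyncResult. You need to call get() on the AsyncResult to actually obtain the value.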
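If keeping a list of AsyncResult objects is still a concern, one option is the callback parameter of apply_async, which is invoked in the parent process with each result as it arrives, so no list is needed. The following is only a sketch under assumptions: Accumulator, accumulate_with_callback and ARRAY_LEN are hypothetical names introduced here, and func is reduced to the array construction from the code above.

```python
import multiprocessing

import numpy as np

ARRAY_LEN = 10000  # assumed: same array length as in the code above


def func(chrName, simNum):
    signal_array = np.full((ARRAY_LEN,), simNum, dtype=float)
    count_array = np.full((ARRAY_LEN,), simNum, dtype=int)
    return [signal_array, count_array]


class Accumulator:
    """Hypothetical helper holding the running totals in the parent process."""

    def __init__(self, length=ARRAY_LEN):
        self.signal = np.zeros((length,), dtype=float)
        self.count = np.zeros((length,), dtype=int)
        self.errors = []

    def add(self, result):
        # Called in the parent process once per successfully finished task
        self.signal += result[0]
        self.count += result[1]

    def fail(self, exc):
        # Called instead of add() when func raised an exception
        self.errors.append(exc)


def accumulate_with_callback(chrNames, sims, processes=None):
    acc = Accumulator()
    pool = multiprocessing.Pool(processes)
    for chrName in chrNames:
        for simNum in sims:
            # The returned AsyncResult is deliberately not stored
            pool.apply_async(func, (chrName, simNum),
                             callback=acc.add, error_callback=acc.fail)
    pool.close()
    pool.join()  # wait for the workers; read the totals only after this
    return acc


if __name__ == '__main__':
    acc = accumulate_with_callback(['chr1', 'chr2', 'chr3'], [1, 2, 3])
    print(acc.signal)
    print(acc.count)
    print(len(acc.errors))
```

The callbacks run in the pool's result-handling thread, so they should stay short; a slow callback delays the handling of every later result. Without error_callback, an exception raised in func would go unnoticed here, because get() is never called.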