Python concurrency: why is only one worker running?
I am experimenting with concurrent.futures.ProcessPoolExecutor to parallelize a serial task. The serial task finds how many numbers in a range contain a given number. My code is shown below. While it runs, I can see in Task Manager/System Monitor/top that only one CPU/thread is busy the whole time, even though I passed a value greater than 1 as max_workers to ProcessPoolExecutor. Why is that? How can I parallelize my code with concurrent.futures?
My code was run with Python 3.5.
import concurrent.futures as cf
from time import time

def _findmatch(nmax, number):
    print('def _findmatch(nmax, number):')
    start = time()
    match=[]
    nlist = range(nmax)
    for n in nlist:
        if number in str(n):match.append(n)
    end = time() - start
    print("found {} in {}sec".format(len(match),end))
    return match

def _concurrent(nmax, number, workers):
    with cf.ProcessPoolExecutor(max_workers=workers) as executor:
        start = time()
        future = executor.submit(_findmatch, nmax, number)
        futures = future.result()
        found = len(futures)
        end = time() - start
        print('with statement of def _concurrent(nmax, number):')
        print("found {} in {}sec".format(found, end))
    return futures

if __name__ == '__main__':
    match=[]
    nmax = int(1E8)
    number = str(5) # Find this number
    workers = 3
    start = time()
    a = _concurrent(nmax, number, workers)
    end = time() - start
    print('main')
    print("found {} in {}sec".format(len(a),end))
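Running your code shows that all three workers are there, but two of them are asleep. The problem is that executor.submit(_findmatch, nmax, number) tells only one worker to execute the function _findmatch.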
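To make the point above concrete, here is a minimal sketch showing that each call to submit() creates exactly one task, which is picked up by a single worker process. The helper names `work` and `run_tasks`, the sleep duration, and the task counts are made up for illustration and are not part of the original code.

```python
import concurrent.futures as cf
import os
import time


def work(task_id):
    """Pretend to do some work and report which process ran the task."""
    time.sleep(0.1)
    return task_id, os.getpid()


def run_tasks(n_tasks, workers=3):
    """Submit n_tasks separate tasks and collect (task_id, pid) pairs."""
    with cf.ProcessPoolExecutor(max_workers=workers) as executor:
        # One submit() call == one task. The pool never splits a task up.
        futures = [executor.submit(work, i) for i in range(n_tasks)]
        return [f.result() for f in futures]


if __name__ == "__main__":
    one = run_tasks(1)
    # A single submitted task can only ever run in a single process.
    print("1 task  ->", len({pid for _, pid in one}), "distinct pid")

    many = run_tasks(6)
    # With several tasks, the pool is free to spread them over its workers.
    print("6 tasks ->", len({pid for _, pid in many}), "distinct pid(s)")
```

With one submitted task the set of process ids always has one element, no matter how large max_workers is. With six tasks the number of distinct ids depends on scheduling, so it is printed rather than asserted.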
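I don't understand what your code is doing, but basically you need to do one of the following:
- Split the task into three even parts and use executor.submit to send each part to a process.
- Split the task into smaller chunks (say, 100 elements per chunk) and use executor.map so that each call to _findmatch only gets the chunk assigned to it.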
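The second option can be sketched as follows. This is only an illustration under assumptions: the helpers `make_chunks`, `count_in_chunk` and `count_with_map`, the chunk size of 10000, and the small range are hypothetical choices and do not come from the original code. Very small chunks (such as 100 elements) mean many more tasks, so a larger chunk size is used here to keep the per-task overhead modest.

```python
import concurrent.futures as cf


def make_chunks(nmax, size):
    """Split range(nmax) into (start, stop) pairs of at most `size` numbers."""
    return [(lo, min(lo + size, nmax)) for lo in range(0, nmax, size)]


def count_in_chunk(bounds, number):
    """Count integers in [start, stop) whose decimal form contains `number`."""
    start, stop = bounds
    return sum(1 for n in range(start, stop) if number in str(n))


def count_with_map(nmax, number, workers=3, size=10000):
    """Count matches in range(nmax) by mapping chunks onto a process pool."""
    chunks = make_chunks(nmax, size)
    with cf.ProcessPoolExecutor(max_workers=workers) as executor:
        # map() takes one iterable per positional argument of the function
        # and yields the results in the same order as the inputs.
        counts = executor.map(count_in_chunk, chunks, [number] * len(chunks))
        return sum(counts)


if __name__ == "__main__":
    # Numbers below 10**6 containing the digit 5: 10**6 - 9**6 = 468559.
    print(count_with_map(10**6, "5"))
```

Executor.map also accepts a chunksize argument (available for ProcessPoolExecutor since Python 3.5) that batches several inputs per round trip to a worker, which may help when the individual chunks are small.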
import concurrent.futures as cf
from time import time

def _findmatch(nmin, nmax, number):
    # Count the integers in [nmin, nmax) whose decimal form contains `number`.
    print('def _findmatch', nmin, nmax, number)
    start = time()
    count = 0
    for n in range(nmin, nmax):
        if number in str(n):
            count += 1
    end = time() - start
    print("found {} in {}sec".format(count,end))
    return count

def _concurrent(nmax, number, workers):
    with cf.ProcessPoolExecutor(max_workers=workers) as executor:
        start = time()
        # Split range(nmax) into one contiguous chunk per worker.
        chunk = nmax // workers
        futures = []
        for i in range(workers):
            cstart = chunk * i
            # The last chunk absorbs the remainder of the integer division.
            cstop = chunk * (i + 1) if i != workers - 1 else nmax
            futures.append(executor.submit(_findmatch, cstart, cstop, number))
        # Block until every chunk is done, then add up the partial counts.
        cf.wait(futures)
        res = sum(f.result() for f in futures)
        end = time() - start
        print('with statement of def _concurrent(nmax, number):')
        print("found {} in {}sec".format(res, end))
    return res

if __name__ == '__main__':
    nmax = int(1E8)
    number = str(5) # Find this number
    workers = 3
    start = time()
    a = _concurrent(nmax, number, workers)
    end = time() - start
    print('main')
    print("found {} in {}sec".format(a,end))
Output:
def _findmatch 0 33333333 5
def _findmatch 33333333 66666666 5
def _findmatch 66666666 100000000 5
found 17190813 in 20.09431290626526sec
found 17190813 in 20.443560361862183sec
found 22571653 in 20.47660517692566sec
with statement of def _concurrent(nmax, number):
found 56953279 in 20.6196870803833sec
main
found 56953279 in 20.648695707321167sec
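Thanks. While I digest your suggestion, I have a question. Why do the chunks need to be created manually? Isn't concurrent.futures.ProcessPoolExecutor supposed to automatically split the work of solving a given function among its pool of workers?
@SunBear: As the programmer, it is your job to split the work into chunks that workers can run independently. ProcessPoolExecutor takes care of calling the chunks that the workers run. Note that in this example, instead of splitting the task into three chunks, I could have split it into 10 different tasks and the final result would be the same (the console output would differ, of course, because _findmatch would run 10 times).
Thanks for the pointers. I have rewritten the code to output a list of the numbers that contain the given number. I will post it in my next question, where I compare its performance with executor.map().
I have benchmarked .submit() and .map() against the serial code. If you have time, please comment.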
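As a rough illustration of the comparison discussed in these comments, the sketch below times a serial run against submit() and map() and checks that the result does not depend on how many chunks the range is split into. It is not the asker's follow-up code: the helpers `find_matches`, `split`, `serial`, `with_submit`, `with_map` and `timed`, the small range, and the chunk counts of 3 and 10 are all assumptions. Timings vary by machine, so none are quoted here.

```python
import concurrent.futures as cf
from time import time


def find_matches(start, stop, number):
    """Return the integers in [start, stop) whose digits contain `number`."""
    return [n for n in range(start, stop) if number in str(n)]


def split(nmax, parts):
    """Split range(nmax) into at most `parts` contiguous (start, stop) pairs."""
    step = -(-nmax // parts)  # ceiling division; assumes nmax > 0
    return [(lo, min(lo + step, nmax)) for lo in range(0, nmax, step)]


def serial(nmax, number):
    return find_matches(0, nmax, number)


def with_submit(nmax, number, workers, parts):
    with cf.ProcessPoolExecutor(max_workers=workers) as executor:
        futures = [executor.submit(find_matches, lo, hi, number)
                   for lo, hi in split(nmax, parts)]
        # Reading futures in submission order keeps the numbers sorted.
        return [n for f in futures for n in f.result()]


def with_map(nmax, number, workers, parts):
    bounds = split(nmax, parts)
    starts = [lo for lo, _ in bounds]
    stops = [hi for _, hi in bounds]
    with cf.ProcessPoolExecutor(max_workers=workers) as executor:
        results = executor.map(find_matches, starts, stops,
                               [number] * len(bounds))
        return [n for chunk in results for n in chunk]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    t0 = time()
    result = fn(*args)
    return result, time() - t0


if __name__ == "__main__":
    nmax, number, workers = 200000, "5", 3
    expected, elapsed = timed(serial, nmax, number)
    print("serial", len(expected), elapsed)
    for parts in (3, 10):
        for fn in (with_submit, with_map):
            result, elapsed = timed(fn, nmax, number, workers, parts)
            assert result == expected  # same answer for any chunk count
            print(fn.__name__, parts, len(result), elapsed)
```

Returning full lists instead of counts means more data is pickled and sent back from the workers, so for small ranges the parallel versions may well be slower than the serial one.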