Python concurrent.futures.ProcessPoolExecutor() not calling a method inside a class
I am trying to use concurrent.futures.ProcessPoolExecutor() inside a class. However, the method operation_data, which is responsible for calling the method operation, never calls it. Neither "inside concurrent insert" nor "do some operations" is printed.
import concurrent.futures

class ConcurrentTest:
    def operation(self, chunk):
        print("inside concurrent insert")
        print("do some operations")

    def operation_data(self):
        chuck_list = [chunk for chunk in self.chunks([millions_of_data_in_this_list], 10000)]
        print('chunk list done')
        with concurrent.futures.ProcessPoolExecutor() as executor:
            for c in chuck_list:
                executor.submit(operation, c)
        print("done")

    @staticmethod
    def chunks(l, n):
        for i in range(0, len(l), n):
            yield l[i:i + n]

concurrent_test = ConcurrentTest()
concurrent_test.operation_data()
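Am I doing something wrong? Please advise.
Thanks in advance.

Change
executor.submit(operation, c)
to
executor.submit(self.operation, c)
Inside a method body, the bare name operation is not looked up on the class, so the method has to be referenced through self.
Alternatively, use map() instead. Note that with map() the batching is handled by the chunksize argument:
executor.map(self.operation, your_big_list, chunksize=n)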
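Edit: As @Booboo pointed out in the comments, you will not get any parallelization if you call future.result() immediately after submit, so I have edited the answer accordingly.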
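To make the point about future.result() concrete, here is a minimal sketch (not the code from the original answer) that submits every chunk first and only afterwards collects the results. The names Summer, sum_chunk, and sum_all are hypothetical and chosen purely for illustration; the chunks helper mirrors the one in the question.

```python
import concurrent.futures


def chunks(items, n):
    """Yield successive n-sized slices of items (same idea as in the question)."""
    for i in range(0, len(items), n):
        yield items[i:i + n]


class Summer:  # hypothetical class, for illustration only
    def sum_chunk(self, chunk):
        # Runs in a worker process; the bound method is pickled together with self.
        return sum(chunk)

    def sum_all(self, data, n):
        with concurrent.futures.ProcessPoolExecutor() as executor:
            # Submit everything first so the workers can run side by side ...
            futures = [executor.submit(self.sum_chunk, c) for c in chunks(data, n)]
            # ... and only then block on the results, in submission order.
            return [f.result() for f in futures]


if __name__ == "__main__":
    print(Summer().sum_all(list(range(10)), 3))  # [3, 12, 21, 9]
```

Calling f.result() inside the submitting loop instead would make each submit wait for the previous task, which is the serialization described in the edit above.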
EDIT2: Here is a complete example using executor.map with the chunksize parameter, with all unnecessary code removed:
import concurrent.futures

class ConcurrentTest:
    def operation(self, chunk):
        print(f"{chunk} inside concurrent insert")
        print(f"do some operations with {chunk}")

    def operation_data(self):
        with concurrent.futures.ProcessPoolExecutor() as executor:
            executor.map(self.operation, range(20), chunksize=3)
        print("done")

concurrent_test = ConcurrentTest()
concurrent_test.operation_data()
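One detail of the map variant is worth spelling out: chunksize only controls how many items are sent to a worker per task, while the callable still receives a single item per call, not a list. The sketch below illustrates this under the assumption of a hypothetical Squarer class (the names Squarer, square, and square_all are not from the original post); it also consumes the results, which the example above does not do.

```python
import concurrent.futures


class Squarer:  # hypothetical class, for illustration only
    def square(self, x):
        # Called once per item, even though items travel to workers in batches.
        return x * x

    def square_all(self, values, chunksize=3):
        with concurrent.futures.ProcessPoolExecutor() as executor:
            # map() yields results in input order; list() consumes them all,
            # which also re-raises any exception raised in a worker.
            return list(executor.map(self.square, values, chunksize=chunksize))


if __name__ == "__main__":
    print(Squarer().square_all(range(6)))  # [0, 1, 4, 9, 16, 25]
```

The order in which workers print may interleave, but the results returned by map() always follow the order of the input.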
If you call submit and then immediately call result() on the returned Future instance, you will not get any parallelization.
@Booboo I don't know why you think so, but even so, I also tested it (printing numbers from a long list), and they were clearly not printed in order. Please suggest an improvement if you think one is needed.
Well, for starters, that example has a pool size of 1, and you cannot get parallelization with a pool of that size anyway, so look at the example again. If you submit a task in a loop (using submit) and then wait for its result (using future.result()) before doing the next submit, how could more than one task ever run at a time, whatever your pool size?
@Booboo OK, I see what you mean. Thanks for the clarification.
What the OP had originally, not waiting on the futures at all, was in a sense actually better. The with block ends with an implicit shutdown(wait=True), which waits for all outstanding tasks to complete. Of course, you will never detect any exceptions that way. Even without a with block, the program will not terminate until all outstanding tasks have completed. Another option, of course, is to save all the futures in a list and then call result on each one. But map with a suitable chunksize argument is the better choice.
Output of the EDIT2 example (yours may differ):
0 inside concurrent insert
do some operations with 0
1 inside concurrent insert
do some operations with 1
--- skipped for brevity ---
13 inside concurrent insert
do some operations with 13
14 inside concurrent insert
do some operations with 14
18 inside concurrent insert
do some operations with 18
19 inside concurrent insert
do some operations with 19
15 inside concurrent insert
11 inside concurrent insert
do some operations with 15
16 inside concurrent insert
do some operations with 16
do some operations with 11
17 inside concurrent insert
do some operations with 17
done
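The remark in the comments that exceptions go unnoticed when the futures are never inspected can be illustrated with a small sketch. An exception raised in a worker is stored on its Future and only surfaces through result() or exception(). Checker, check, and run are hypothetical names used for illustration, not part of the original post.

```python
import concurrent.futures


class Checker:  # hypothetical class, for illustration only
    def check(self, x):
        if x < 0:
            raise ValueError(f"negative input: {x}")
        return x

    def run(self, values):
        ok, failed = [], []
        with concurrent.futures.ProcessPoolExecutor() as executor:
            # Remember which input each future belongs to.
            futures = {executor.submit(self.check, v): v for v in values}
            for future in concurrent.futures.as_completed(futures):
                # exception() returns the worker's exception, or None on success.
                if future.exception() is not None:
                    failed.append(futures[future])
                else:
                    ok.append(future.result())
        return sorted(ok), sorted(failed)


if __name__ == "__main__":
    print(Checker().run([1, -2, 3]))  # ([1, 3], [-2])
```

The same applies to map(): its exceptions are raised only while the returned iterator is being consumed, so a call whose results are discarded will not report them either.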