Multiprocessing in a Python for loop


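I have the following matching() function with a for loop, and I pass it a large generator (unique_combinations). It takes days to process, so I want to use multiprocessing on the elements of the loop to speed it up, but I just can't figure out how to do it.

I find the logic of concurrent.futures hard to understand.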

    results = []
    match_score = []

    def matching():    
        for pair in unique_combinations:        
            if fuzz.ratio(pair[0], pair[1]) > 90:    
                results.append(pair)    
                match_score.append(fuzz.ratio(pair[0], pair[1]))

    def main():    
        executor = ProcessPoolExecutor(max_workers=3)    
        task1 = executor.submit(matching)    
        task2 = executor.submit(matching)    
        task3 = executor.submit(matching)

    if __name__ == '__main__':
        main()

    print(results)
    print(match_score)

I thought this would speed up execution.

If you are already using concurrent.futures, the nicest way is to use map:

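import concurrent.futures

from fuzzywuzzy import fuzz  # assumption: the thread never shows where fuzz comes from


def matching(pair):
    fuzz_ratio = fuzz.ratio(pair[0], pair[1])  # only calculate this once
    if fuzz_ratio > 90:
        return pair, fuzz_ratio
    else:
        return None


def main():
    # collect the results in the parent process
    results = []
    match_score = []
    # placeholder data; substitute the real generator of pairs
    unique_combinations = [("apple", "apple"), ("apple", "appel"), ("apple", "pear")]
    with concurrent.futures.ProcessPoolExecutor(max_workers=4) as executor:
        for result in executor.map(matching, unique_combinations, chunksize=100):
            if result:
                # handle the results somehow
                results.append(result[0])
                match_score.append(result[1])
    print(results)
    print(match_score)


if __name__ == '__main__':
    main()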

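There are many ways to handle the results, but the gist is to return a value from matching and then retrieve it in the for loop over executor.map in main. See the concurrent.futures documentation.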
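To illustrate why the value has to be returned rather than appended to a module-level list as in the question: each worker process operates on its own copy of the module's globals, so appends made inside a worker never reach the parent. The following is a minimal sketch of that behaviour; append_in_worker and run are hypothetical names used only for this demonstration.

```python
import concurrent.futures

results = []  # module-level list, as in the question


def append_in_worker(x):
    # This mutates the copy of `results` that lives in the worker process.
    results.append(x)
    return x * 2  # only the return value is sent back to the parent


def run():
    with concurrent.futures.ProcessPoolExecutor(max_workers=2) as executor:
        returned = list(executor.map(append_in_worker, range(5)))
    # `results` here is the parent's list, which no worker could modify.
    return returned, list(results)


if __name__ == "__main__":
    returned, parent_side = run()
    print(returned)     # values returned by the workers, in input order
    print(parent_side)  # the parent's list stays empty
```

The same reasoning suggests why submitting matching three times, as in the question, would not split the work: each submit call would run the whole loop once more in a separate process.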

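Thank you so much, this is exactly what I needed. When I try to run the code, though, I get BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending. I tried changing max_workers, but it didn't help. Any ideas?

@cnns A quick Google suggests it might be because you haven't wrapped your code in main and if __name__ … . It has something to do with how Windows forks processes (or can't fork them, or something).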
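Separately from the guard discussed above, the question mentions a very large generator. Executor.map collects its input iterable immediately rather than lazily, so a huge generator is read in full up front. Whether that played any part in the error reported here is not established in this thread. The sketch below shows one way to feed the pool in bounded batches instead. It uses difflib.SequenceMatcher as a stand-in for fuzz.ratio so that it runs without third-party packages; score, batched, match_in_batches and the batch sizes are all hypothetical choices for illustration.

```python
import concurrent.futures
import itertools
from difflib import SequenceMatcher


def score(pair):
    # Stand-in for fuzz.ratio (assumption): similarity on a 0-100 scale.
    ratio = 100 * SequenceMatcher(None, pair[0], pair[1]).ratio()
    return (pair, ratio) if ratio > 90 else None


def batched(iterable, size):
    # Yield lists of at most `size` items without reading the whole iterable.
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, size))
        if not batch:
            return
        yield batch


def match_in_batches(pairs, batch_size=10_000, max_workers=4):
    matches = []
    with concurrent.futures.ProcessPoolExecutor(max_workers=max_workers) as executor:
        for batch in batched(pairs, batch_size):
            # Only one batch of pairs is handed to the pool at a time.
            for result in executor.map(score, batch, chunksize=100):
                if result:
                    matches.append(result)
    return matches


if __name__ == "__main__":
    words = ["apple", "apple", "appel", "pear"]
    pairs = itertools.combinations(words, 2)  # a lazy generator of pairs
    for pair, ratio in match_in_batches(pairs, batch_size=2, max_workers=2):
        print(pair, round(ratio))
```

The batch size trades memory for scheduling overhead: larger batches keep the workers busier between synchronisation points, while smaller ones keep fewer pairs in memory at once.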