Parallelizing nested for loops in Python to find the max value

python parallel-processing

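I have been struggling to improve the execution time of this piece of code. Since the calculations are very time-consuming, I think the best solution is to parallelize the code. The output could also be kept in memory and written to a file afterwards.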

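I am new to both Python and parallelism, so I find it difficult to apply the concepts I have seen explained. I also found a related question, but I could not figure out how to implement the same approach for my situation. I am working on a Windows platform, using Python 3.4.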

for i in range(0, len(unique_words)):
    max_similarity = 0
    max_similarity_word = ""
    for j in range(0, len(unique_words)):
        if not i == j:
            similarity = calculate_similarity(global_map[unique_words[i]], global_map[unique_words[j]])
            if similarity > max_similarity:
                max_similarity = similarity
                max_similarity_word = unique_words[j]
    file_co_occurring.write(
        unique_words[i] + "\t" + max_similarity_word + "\t" + str(max_similarity) + "\n")

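Here is an explanation of the code, in case you need it:

- `unique_words` is a list of words (strings).
- `global_map` is a dictionary whose keys are words (`global_map.keys()` contains the same elements as `unique_words`) and whose values are dictionaries of the form {word: value}, where the words are a subset of the values in `unique_words`.
- For each word, I look for the most similar word based on its value in `global_map`. I would rather not store every similarity in memory, since the maps already take up too much memory.
- `calculate_similarity` returns a value between 0 and 1.
- The result should contain the most similar word for each word in `unique_words`. The most similar word must be different from the word itself, which is why I added the condition `if not i == j`, although the same could be achieved by checking whether `max_similarity` is different from 1.
- If the `max_similarity` for a word is 0, it is fine for the most similar word to be the empty string.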
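
To make the data layout described above concrete, here is a minimal sketch with toy data. The contents of `unique_words` and `global_map` are invented for illustration, and the cosine-style `calculate_similarity` is only an assumed stand-in, since the question does not show the real function. The helper name `most_similar` is hypothetical as well.

```python
import math

# Toy data, invented for illustration only.
unique_words = ["cat", "dog", "car"]
global_map = {
    "cat": {"dog": 2.0, "car": 1.0},
    "dog": {"cat": 2.0},
    "car": {"cat": 1.0, "dog": 1.0},
}


def calculate_similarity(vec_a, vec_b):
    """Assumed stand-in: cosine similarity of two sparse {word: value} dicts."""
    dot = sum(value * vec_b.get(key, 0.0) for key, value in vec_a.items())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(word):
    """Return (best_word, best_score) for one word, keeping only the running max."""
    best_word, best_score = "", 0.0
    for other in unique_words:
        if other == word:
            continue  # a word must not be matched with itself
        score = calculate_similarity(global_map[word], global_map[other])
        if score > best_score:
            best_word, best_score = other, score
    return best_word, best_score
```

Each call to `most_similar` depends only on its own word, so the rows of the nested loop are independent of each other. Only one running maximum per word is kept, which matches the wish not to hold every similarity in memory.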

from concurrent.futures import ThreadPoolExecutor, Future
from itertools import permutations
from collections import namedtuple, defaultdict

# Pairs a similarity value with the word it was computed against.
Result = namedtuple('Result', ('value', 'word'))

def new_calculate_similarity(word1, word2):
    # Wrap calculate_similarity so the result remembers which word2 produced it.
    return Result(
        calculate_similarity(global_map[word1], global_map[word2]),
        word2)

with ThreadPoolExecutor(max_workers=4) as executor:
    # Maps each word to the futures that compare it against every other word.
    futures = defaultdict(list)
    # permutations(..., r=2) yields every ordered pair of different list
    # entries, so a word is never compared with itself.
    for word1, word2 in permutations(unique_words, r=2):
        futures[word1].append(
            executor.submit(new_calculate_similarity, word1, word2))

    for word in futures:
        # this will block until all calculations have completed for 'word'
        results = map(Future.result, futures[word])
        max_result = max(results, key=lambda r: r.value)
        print(word, max_result.word, max_result.value,
            sep='\t',
            file=file_co_occurring)
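
The code above submits one future per ordered word pair and keeps all of them in the `futures` dictionary until they are consumed. As a possible variation (a sketch, not part of the original answer), each task can handle a whole word and return only its best match, so a single result per word is held. The toy `unique_words` and `global_map`, the overlap-based `calculate_similarity`, and the helper names `best_match` and `find_best_matches` are all assumptions made so the example runs on its own.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy stand-ins (assumptions): the real data and similarity function
# come from the question and are not shown there.
unique_words = ["a", "b", "c", "d"]
global_map = {
    "a": {"b": 1, "c": 1},
    "b": {"a": 1, "c": 1},
    "c": {"d": 1},
    "d": {},
}


def calculate_similarity(map_a, map_b):
    """Hypothetical similarity: Jaccard overlap of the two key sets (0 to 1)."""
    union = set(map_a) | set(map_b)
    if not union:
        return 0.0
    return len(set(map_a) & set(map_b)) / len(union)


def best_match(word):
    """Scan every other word and keep only the running maximum."""
    best_word, best_value = "", 0
    for other in unique_words:
        if other != word:
            value = calculate_similarity(global_map[word], global_map[other])
            if value > best_value:
                best_word, best_value = other, value
    return word, best_word, best_value


def find_best_matches(max_workers=4):
    """Run best_match once per word; executor.map preserves input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(best_match, unique_words))


if __name__ == "__main__":
    for word, match, value in find_best_matches():
        print(word, match, value, sep="\t")
```

In CPython, threads running pure-Python CPU-bound code are limited by the global interpreter lock, so `ProcessPoolExecutor` from the same module may be the better fit if `calculate_similarity` is the bottleneck. On Windows that requires creating the pool under an `if __name__ == "__main__":` guard and defining `best_match` and the data it reads at module top level, so that worker processes can import them.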
