
Python中更好的并行处理示例


I hope I don't get downvoted this time. I have been struggling with parallel processing in Python for a while now (two days, to be exact). I have checked these resources (a partial list is shown here):

(a)

(b)

I haven't had much luck with them. What I want to do is:

Master:

Break up the file into chunks(strings or numbers)
Broadcast a pattern to be searched to all the workers
Receive the offsets in the file where the pattern was found
Workers:

Receive pattern and chunk of text from the master
Compute()
Send back the offsets to the master.
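The master/worker steps above can be sketched with `concurrent.futures`. This is a minimal sketch, not the poster's code: the names `find_offsets` and `master`, the chunk size, and the whole-file read are my assumptions. Note that matches spanning a chunk boundary are missed unless adjacent chunks overlap by `len(pat) - 1` characters.

```python
from concurrent.futures import ProcessPoolExecutor, as_completed

def find_offsets(pat, base, chunk):
    # Worker: receive the pattern and a chunk of text, return the
    # absolute file offsets of every match within this chunk
    return [base + i for i in range(len(chunk) - len(pat) + 1)
            if chunk[i:i + len(pat)] == pat]

def master(filename, pat, chunksize=4096):
    # Master: break the file into chunks, hand each chunk plus the
    # pattern to a worker, and collect the offsets that come back
    with open(filename) as f:
        text = f.read()
    futures = []
    with ProcessPoolExecutor() as pool:
        for base in range(0, len(text), chunksize):
            futures.append(pool.submit(find_offsets, pat, base,
                                       text[base:base + chunksize]))
        offsets = []
        for fut in as_completed(futures):
            offsets.extend(fut.result())
    return sorted(offsets)
```

For a file that fits in memory this maps each of the three master steps directly onto `submit`/`as_completed`; a production version would read chunks lazily and overlap them at the boundaries.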
I tried to implement this using MPI, concurrent.futures, and multiprocessing, and failed.

My naive implementation using the multiprocessing module:

import multiprocessing

filename = "file1.txt"
pat = "afow"
N = 1000

""" This is the naive string search algorithm"""

def search(pat, txt):

    patLen = len(pat)
    txtLen = len(txt)
    offsets = []

    # A loop to slide pattern[] one by one
    # Range generates numbers up to but not including that number
    for i in range((txtLen - patLen) + 1):

        # The C-style for loop with an && condition is written as a
        # while statement in Python
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)

    # Return only after the whole text has been scanned
    return str(offsets).strip('[]')

       """"
       This is what I want 
if __name__ == "__main__":
     tasks = []
     pool_outputs = []
     pool = multiprocessing.Pool(processes=5)
     with open(filename, 'r') as infile:
           lines = []
           for line in infile:
                lines.append(line.rstrip())
                if len(lines) > N:
                     pool_output = pool.map(search, tasks)
                     pool_outputs.append(pool_output)
                     lines = []
                if len(lines) > 0:
                     pool_output = pool.map(search, tasks)
                     pool_outputs.append(pool_output)
     pool.close()
     pool.join()
     print('Pool:', pool_outputs)
         """""

with open(filename, 'r') as infile:
    for line in infile:
        print(search(pat, line))
I would greatly appreciate your guidance, especially with concurrent.futures. Thanks for your time. Valeriy helped me out, and I thank him for that.

But if anyone could indulge me, this is the code I wrote for concurrent.futures (modelled on an example I saw somewhere):

from concurrent.futures import ProcessPoolExecutor, as_completed
import math

""" This is the naive string search algorithm"""

def search(pat, txt):
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []

    # A loop to slide pattern[] one by one
    # Range generates numbers up to but not including that number
    for i in range((txtLen - patLen) + 1):
        # The C-style for loop with an && condition is written as a
        # while statement in Python
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)
    return str(offsets).strip('[]')

# Check a list of strings
def chunked_worker(lines):
    return {0: search("fmo", line) for line in lines}

def pool_bruteforce(filename, nprocs):
    lines = []
    with open(filename) as f:
        lines = [line.rstrip('\n') for line in f]
    chunksize = int(math.ceil(len(lines) / float(nprocs)))
    futures = []
    with ProcessPoolExecutor() as executor:
        for i in range(nprocs):
            chunk = lines[(chunksize * i):(chunksize * (i + 1))]
            futures.append(executor.submit(chunked_worker, chunk))
    resultDict = {}
    for f in as_completed(futures):
        resultDict.update(f.result())
    return resultDict


filename = "file1.txt"
pool_bruteforce(filename, 5)
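One quirk worth noting in the attempt above: the dict comprehension inside `chunked_worker` writes every result to the single key `0`, so lines within a chunk overwrite one another and the merged result dict ends up with at most one entry per chunk. A minimal sketch of one way around it, keying each result by a global line number (the `start` parameter and the list-of-offsets return type are my additions, not from the original post):

```python
def search(pat, txt):
    # Naive scan: every offset at which pat occurs in txt
    return [i for i in range(len(txt) - len(pat) + 1)
            if txt[i:i + len(pat)] == pat]

def chunked_worker(pat, start, lines):
    # Key each result by its global line number (start + position in
    # the chunk) so dicts from different chunks merge without clashes
    return {start + idx: search(pat, line)
            for idx, line in enumerate(lines)}
```

The caller would pass `chunksize * i` as `start` when submitting chunk `i`, so the merged dictionary maps every line of the file to its match offsets.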

Thanks again to Valeriy and to anyone who tries to help me solve this riddle.

Your search function takes several arguments, so bind the pattern with functools.partial:

import multiprocessing
from functools import partial
filename = "file1.txt"
pat = "afow"
N = 1000

""" This is the naive string search algorithm"""

def search(pat, txt):
    patLen = len(pat)
    txtLen = len(txt)
    offsets = []

    # A loop to slide pattern[] one by one
    # Range generates numbers up to but not including that number
    for i in range((txtLen - patLen) + 1):

        # The C-style for loop with an && condition is written as a
        # while statement in Python
        counter = 0
        while counter < patLen and pat[counter] == txt[counter + i]:
            counter += 1
        if counter >= patLen:
            offsets.append(i)

    # Return only after the whole text has been scanned
    return str(offsets).strip('[]')


if __name__ == "__main__":
     tasks = []
     pool_outputs = []
     pool = multiprocessing.Pool(processes=5)
     lines = []
     with open(filename, 'r') as infile:
         for line in infile:
             lines.append(line.rstrip())                 
     tasks = lines
     func = partial(search, pat)
     if len(lines) > N:
        pool_output = pool.map(func, lines )
        pool_outputs.append(pool_output)     
     elif len(lines) > 0:
        pool_output = pool.map(func, lines )
        pool_outputs.append(pool_output)
     pool.close()
     pool.join()
     print('Pool:', pool_outputs)

Valeriy: thank you. What does partial do? Do you know of any resources that cover parallel processing in Python thoroughly? Thanks again. Valeriy: I have read it, but I can't really follow it. Sorry, I meant a proper example inside a function. Thanks.
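Since the question about `partial` was never answered in the thread: `functools.partial(f, x)` freezes `x` as the first argument of `f` and returns a new callable that takes the remaining arguments. That is what lets `pool.map`, which passes exactly one item per call, drive a two-argument function like `search`. A small self-contained illustration (the helper `search` here is a simplified stand-in, not the code from the post):

```python
from functools import partial

def search(pat, txt):
    # Return offsets of every occurrence of pat in txt
    return [i for i in range(len(txt) - len(pat) + 1)
            if txt[i:i + len(pat)] == pat]

# partial(search, "ab") behaves like lambda txt: search("ab", txt):
# the pattern is baked in, leaving a one-argument callable
find_ab = partial(search, "ab")

# map-style call sites can now pass a single argument per line,
# exactly the shape pool.map(func, lines) needs
lines = ["xxab", "abab", "none"]
results = list(map(find_ab, lines))
```

`pool.map(partial(search, pat), lines)` therefore calls `search(pat, line)` for each line, which is why the answer above builds `func = partial(search, pat)` before mapping.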