Handling multiple results in Python multiprocessing

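I am writing a piece of Python code that uses multiprocessing to parse a large number of ASCII files. For each file, I have to perform the operations in this function: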

def parse_file(file_name):
    record = False
    path_include = []
    buffer_include = []
    include_file_filters = {}
    include_keylines = {}
    grids_lines = []
    mat_name_lines = []
    pids_name_lines = []
    pids_shell_lines= []
    pids_weld_lines = []
    shells_lines = []
    welds_lines = []
    with open(file_name, 'rb') as in_file:
        for lineID, line in enumerate(in_file):
            if record:
                path_include += line
            if record and re.search(r'[\'|\"]$', line.strip()):
                buffer_include.append(re_path_include.search(
                    path_include).group(1).replace('\n', ''))
                record = False
            if 'INCLUDE' in line and '$' not in line:
                if re_path_include.search(line):
                    buffer_include.append(
                        re_path_include.search(line).group(1))
                else:
                    path_include = line
                    record = True
            if line.startswith('GRID'):
                grids_lines += [lineID]
            if line.startswith('$HMNAME MAT'):
                mat_name_lines += [lineID]
            if line.startswith('$HMNAME PROP'):
                pids_name_lines += [lineID]
            if line.startswith('PSHELL'):
                pids_shell_lines += [lineID]
            if line.startswith('PWELD'):
                pids_weld_lines += [lineID]
            if line.startswith(('CTRIA3', 'CQUAD4')):
                shells_lines += [lineID]
            if line.startswith('CWELD'):
                welds_lines += [lineID]
    include_keylines = {'grid': grids_lines, 'mat_name': mat_name_lines, 'pid_name': pids_name_lines, \
                        'pid_shell': pids_shell_lines, 'pid_weld': pids_weld_lines, 'shell': shells_lines, 'weld': welds_lines}
    include_file_filters = {file_name: include_keylines}
    return buffer_include, include_file_filters

This function is called in a loop over a list of files, with each process on the CPU parsing one complete file.

The `grouper` function used in that loop is defined as:

def grouper(iterable, padvalue=None):
    return itertools.izip_longest(*[iter(iterable)]*mp.cpu_count(), fillvalue=padvalue)
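For readers unfamiliar with this idiom, the sketch below shows what `grouper` yields. It is written for Python 3, where `izip_longest` is named `zip_longest`, and it takes the group size as an explicit parameter `n` (an assumption made here so the result does not depend on the machine's core count). The file names are hypothetical. Note that the last group is padded with `None`, so the padding has to be filtered out before a group is handed to a function that expects file names.

```python
import itertools


def grouper(iterable, n, padvalue=None):
    """Yield tuples of n consecutive items, padding the last tuple with padvalue."""
    # The same iterator object is repeated n times, so zip_longest pulls
    # n consecutive items from it for every tuple it builds.
    return itertools.zip_longest(*[iter(iterable)] * n, fillvalue=padvalue)


files = ['a.dat', 'b.dat', 'c.dat', 'd.dat', 'e.dat']  # hypothetical file names
groups = list(grouper(files, 2))
# Drop the None padding before a group is passed on to a parser.
cleaned = [[name for name in group if name is not None] for group in groups]
```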

I am using Python 2.7.15 on a CPU with 4 cores (Intel Core i3-6006U).

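When I run my code, I see all CPUs running at 100%, and the output in the Python console is `Running: MainProcess ()`, but nothing else happens. My code seems to block at the statement `results = p.map(parse_file, include)` and cannot proceed (the code works fine when I parse the files one at a time without parallelization).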

What is going wrong? How can I handle the results returned by the `parse_file` function during parallel execution? Is my approach correct?

Thanks in advance for your support.

Edit

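Thanks, darc, for your reply. I tried your suggestion, but the problem remained. It does seem to be solved, however, if I put the code under an if statement like this: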

if __name__ == '__main__':

This is probably due to the way Python's IDLE handles processes. I use the IDLE environment for development and debugging.
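A minimal sketch of that layout follows, assuming a stand-in worker named `count_grid_lines` (hypothetical; it takes text directly so the example needs no files on disk). The guard matters most on platforms that start workers by launching a fresh interpreter and importing the main module again, which is the default on Windows: without it, every worker would re-run the code that creates the pool.

```python
import multiprocessing as mp


def count_grid_lines(text):
    """Hypothetical worker: count the lines of text that start with 'GRID'."""
    return sum(1 for line in text.splitlines() if line.startswith('GRID'))


def main():
    # Two tiny in-memory "files" stand in for real input decks.
    decks = ['GRID 1\nGRID 2\nCQUAD4 1', 'PSHELL 1\nGRID 3']
    pool = mp.Pool(2)
    try:
        counts = pool.map(count_grid_lines, decks)
    finally:
        pool.close()
        pool.join()
    print(counts)


if __name__ == '__main__':
    # Only the parent process runs this branch; workers that re-import
    # this module see a different __name__ and skip it.
    main()
```

The worker is defined at module level, outside the guard, so that worker processes can find it by name.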

According to the Python documentation:

map(func, iterable[, chunksize]): a parallel equivalent of the map() built-in function (it supports only one iterable argument though). It blocks until the result is ready.

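This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks. The (approximate) size of these chunks can be specified by setting chunksize to a positive integer.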
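To make the idea of chunks concrete, the helper below splits an iterable into tuples of at most `size` items. It only illustrates the behaviour the documentation describes, under the hypothetical name `chunked`; it is not the pool's internal code, and the file names are made up.

```python
import itertools


def chunked(iterable, size):
    """Yield tuples of at most size consecutive items from iterable."""
    iterator = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(iterator, size))
        if not chunk:
            return
        yield chunk


files = ['a.dat', 'b.dat', 'c.dat', 'd.dat', 'e.dat']  # hypothetical file names
one_per_task = list(chunked(files, 1))  # roughly what chunksize=1 asks for
two_per_task = list(chunked(files, 2))  # the last chunk is shorter, not padded
```

With a chunk size of 1, each task handed to a worker holds a single file, which matches the goal of one file per process.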

Because map blocks the calling process, your main process simply waits there until all the files have been parsed.

Since map already splits the iterable into chunks, you can try passing all the include files together as one large iterable:

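import multiprocessing as mp

p = mp.Pool(mp.cpu_count())
buffer_include = []
include_file_filters = {}
# map returns a list with one (buffer_include, include_file_filters) tuple per file
results = p.map(parse_file, list_of_file_path, 1)
for file_includes, file_filters in results:
    buffer_include += file_includes
    include_file_filters.update(file_filters)
p.close()
p.join()  # wait for the worker processes to exit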

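If you want to keep the original loop, use apply_async; or, if you are on Python 3, you can use the submit() method of ProcessPoolExecutor and read the results from the futures it returns.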
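A sketch of the apply_async variant follows, assuming a stand-in worker `parse_text` (hypothetical) that returns the same kind of (list, dict) pair as `parse_file` but reads from in-memory text; `merge` is likewise a helper introduced only for this example. Each apply_async call returns an AsyncResult immediately, and calling get() on it blocks until that one task has finished.

```python
import multiprocessing as mp


def parse_text(name, text):
    """Hypothetical stand-in for parse_file: returns (includes, {name: keylines})."""
    lines = text.splitlines()
    includes = [line.split()[1] for line in lines if line.startswith('INCLUDE')]
    grid_lines = [i for i, line in enumerate(lines) if line.startswith('GRID')]
    return includes, {name: {'grid': grid_lines}}


def merge(pairs):
    """Merge (includes, filters) pairs, one pair per parsed input."""
    buffer_include, include_file_filters = [], {}
    for includes, filters in pairs:
        buffer_include.extend(includes)
        include_file_filters.update(filters)
    return buffer_include, include_file_filters


def main():
    decks = {'main.dat': 'INCLUDE part.dat\nGRID 1', 'part.dat': 'GRID 2\nGRID 3'}
    pool = mp.Pool(2)
    try:
        # Submit every task first, then collect, so the tasks can overlap.
        pending = [pool.apply_async(parse_text, (name, text))
                   for name, text in sorted(decks.items())]
        pairs = [task.get() for task in pending]
    finally:
        pool.close()
        pool.join()
    print(merge(pairs))


if __name__ == '__main__':
    main()
```

Collecting the AsyncResult objects in a list before calling get() keeps the loop structure while still letting the workers run in parallel; calling get() right after each apply_async would serialize the work.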
