
Python multiprocessing.Pool.map dies silently

Tags: python, error-handling, multiprocessing

I've tried parallelizing a for loop to speed up some code. Consider this:

from multiprocessing import Pool

results = []

def do_stuff(str):
    print str
    results.append(str)

p = Pool(4)
p.map(do_stuff, ['str1','str2','str3',...]) # many strings here ~ 2000
p.close()

print results
I have some debug messages coming from do_stuff to track how far the program gets before it dies. It seems to die in a different place each time. For example, it will print 'str297' and then just stop running: I can see all the CPUs stop working and the program just sits there. Some error must be occurring, but no error message is shown. Does anyone know how to debug this problem?
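One generic way to make worker failures visible (a sketch, not code from the original question) is to wrap the worker function so any exception is caught and returned to the parent as traceback text instead of being lost:

```python
import traceback
from multiprocessing import Pool

def do_stuff(text):
    # Stand-in for the real work; 'str3' simulates a failing task
    if text == 'str3':
        raise ValueError('boom on {}'.format(text))
    return text

def safe_do_stuff(text):
    # Catch everything so a worker never dies silently; hand the
    # traceback text back to the parent instead of losing it.
    try:
        return ('ok', do_stuff(text))
    except Exception:
        return ('error', traceback.format_exc())

if __name__ == '__main__':
    p = Pool(4)
    results = p.map(safe_do_stuff, ['str1', 'str2', 'str3', 'str4'])
    p.close()
    p.join()
    for status, value in results:
        print(status, value)
```

With this shape, a failing task shows up in the collected results as an ('error', traceback) pair rather than killing the worker without a trace.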

Update

I tried rewriting the code. Instead of using the map function, I tried the apply_async function, like this:

        pool = Pool(5)
        # previous approach: results = pool.map(do_sym, underlyings[0::10])
        results = []
        for sym in underlyings[0::10]:
            r = pool.apply_async(do_sym, [sym])
            results.append(r)

        pool.close()
        pool.join()

        for result in results:
            print result.get(timeout=1000)
This works just as well as the map function did, but it ends up hanging in the same way. It never reaches the for loop where it would print the results.
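As an aside (a sketch assuming Python 3, not code from the original post): AsyncResult.get() re-raises in the parent any exception the worker raised, so a try/except around get() with a timeout can turn a silent failure into a visible error:

```python
import multiprocessing as mp

def do_sym(sym):
    # Stand-in for the real per-symbol work; 'bad' simulates a failure
    if sym == 'bad':
        raise RuntimeError('failed on {}'.format(sym))
    return sym.upper()

if __name__ == '__main__':
    pool = mp.Pool(2)
    async_results = [pool.apply_async(do_sym, [s]) for s in ['aapl', 'bad', 'msft']]
    pool.close()
    pool.join()
    for r in async_results:
        try:
            # get() re-raises the worker's exception here,
            # and the timeout turns a hang into a TimeoutError
            print(r.get(timeout=10))
        except Exception as exc:
            print('task failed:', exc)
```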

After doing some more work, and trying the debug logging suggested in unutbu's answer, I'll give some more information here. The problem is very strange. The pool just seems to hang there, unable to close and let the program continue. I had been using the PyDev environment to test my program, but I figured I'd try running Python in a console. In the console I got the same behavior, but when I pressed Ctrl+C to kill the program, I got some output that may explain the problem:

> KeyboardInterrupt ^CProcess PoolWorker-47: Traceback (most recent call
> last):   File "/usr/lib/python2.7/multiprocessing/process.py", line
> 258, in _bootstrap Process PoolWorker-48: Traceback (most recent call
> last):   File "/usr/lib/python2.7/multiprocessing/process.py", line
> 258, in _bootstrap Process PoolWorker-45: Process PoolWorker-46:
> Process PoolWorker-44:
>     self.run()   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
>     self._target(*self._args, **self._kwargs)   File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
> Traceback (most recent call last): Traceback (most recent call last):
> Traceback (most recent call last):   File
> "/usr/lib/python2.7/multiprocessing/process.py", line 258, in
> _bootstrap   File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap   File
> "/usr/lib/python2.7/multiprocessing/process.py", line 258, in
> _bootstrap
>     task = get()   File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
>     self.run()   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
>     racquire()
>     self._target(*self._args, **self._kwargs)   File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
> KeyboardInterrupt
>     task = get()   File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
>     self.run()
>     self.run()   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
>     self.run()   File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run   File
> "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
>     self._target(*self._args, **self._kwargs)   File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
>     self._target(*self._args, **self._kwargs)
>     self._target(*self._args, **self._kwargs)   File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker
>     racquire()   File "/usr/lib/python2.7/multiprocessing/pool.py", line 102, in worker KeyboardInterrupt
>     task = get()   File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
>     task = get()
>     task = get()   File "/usr/lib/python2.7/multiprocessing/queues.py", line 376, in get  
> File "/usr/lib/python2.7/multiprocessing/queues.py", line 374, in get
>     racquire()
>     return recv()
>     racquire() KeyboardInterrupt KeyboardInterrupt KeyboardInterrupt
So the program never actually dies. I ended up having to close the terminal window to kill it.
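To keep a stuck pool from blocking forever, one option (a sketch, not from the original post) is map_async with a timeout on get(): if the workers stall, get() raises multiprocessing.TimeoutError instead of hanging, and the pool can then be terminated:

```python
import multiprocessing as mp

def do_stuff(text):
    # Stand-in for the real work
    return text

if __name__ == '__main__':
    pool = mp.Pool(4)
    async_result = pool.map_async(do_stuff, ['str{}'.format(i) for i in range(100)])
    pool.close()
    try:
        # Raises mp.TimeoutError instead of blocking forever if workers stall
        results = async_result.get(timeout=60)
        print(len(results))
    except mp.TimeoutError:
        pool.terminate()  # forcibly kill the stuck workers
        print('pool hung; terminated it')
    pool.join()
```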

Update 2


I've narrowed the problem down to something inside the function being run in the pool: a MySQL database transaction was causing it. I was previously using the MySQLdb package. I switched to the pandas.read_sql function for the transaction, and it now works.
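A likely explanation (my reading, not stated in the original post) is that a database connection opened before the Pool was created gets inherited by every forked worker, and such connections generally cannot be shared across processes. A common pattern is to open one connection per worker via the pool's initializer argument; sketched here with the stdlib sqlite3 module so it runs anywhere, though the same shape applies to MySQLdb:

```python
import multiprocessing as mp
import sqlite3

_conn = None  # one private connection per worker process

def init_worker(db_path):
    # Runs once inside each worker, so every process opens
    # its own connection rather than inheriting the parent's
    global _conn
    _conn = sqlite3.connect(db_path)

def do_sym(sym):
    # Use this worker's private connection for the query
    cur = _conn.execute('SELECT ?', (sym,))
    return cur.fetchone()[0]

if __name__ == '__main__':
    pool = mp.Pool(4, initializer=init_worker, initargs=(':memory:',))
    print(pool.map(do_sym, ['aapl', 'msft', 'goog']))
    pool.close()
    pool.join()
```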

pool.map returns the results in a list. So instead of calling results.append in the concurrent processes (which won't work, since each process has its own copy of results), assign results to the value returned by pool.map in the main process:

import multiprocessing as mp

def do_stuff(text):
    return text

if __name__ == '__main__':
    p = mp.Pool(4)
    tasks = ['str{}'.format(i) for i in range(2000)]
    results = p.map(do_stuff, tasks)
    p.close()

    print(results)
which yields

['str0', 'str1', 'str2', 'str3', ...]

One way to debug a script that uses multiprocessing is to add logging statements. The multiprocessing module provides a helper function for this purpose. For example,

import multiprocessing as mp
import logging

logger = mp.log_to_stderr(logging.DEBUG)

def do_stuff(text):
    logger.info('Received {}'.format(text))
    return text

if __name__ == '__main__':
    p = mp.Pool(4)
    tasks = ['str{}'.format(i) for i in range(2000)]
    results = p.map(do_stuff, tasks)
    p.close()

    logger.info(results)
This produces log output such as:

[DEBUG/MainProcess] created semlock with handle 139824443588608
[DEBUG/MainProcess] created semlock with handle 139824443584512
[DEBUG/MainProcess] created semlock with handle 139824443580416
[DEBUG/MainProcess] created semlock with handle 139824443576320
[DEBUG/MainProcess] added worker
[INFO/PoolWorker-1] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/PoolWorker-2] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/PoolWorker-3] child process calling self.run()
[DEBUG/MainProcess] added worker
[INFO/PoolWorker-4] child process calling self.run()
[INFO/PoolWorker-1] Received str0
[INFO/PoolWorker-2] Received str125
[INFO/PoolWorker-3] Received str250
[INFO/PoolWorker-4] Received str375
[INFO/PoolWorker-3] Received str251
...
[INFO/PoolWorker-4] Received str1997
[INFO/PoolWorker-4] Received str1998
[INFO/PoolWorker-4] Received str1999
[DEBUG/MainProcess] closing pool
[INFO/MainProcess] ['str0', 'str1', 'str2', 'str3', ...]
[DEBUG/MainProcess] worker handler exiting
[DEBUG/MainProcess] task handler got sentinel
[INFO/MainProcess] process shutting down
[DEBUG/MainProcess] task handler sending sentinel to result handler
[DEBUG/MainProcess] running all "atexit" finalizers with priority >= 0
[DEBUG/MainProcess] finalizing pool
[DEBUG/MainProcess] task handler sending sentinel to workers
[DEBUG/MainProcess] helping task handler/workers to finish
[DEBUG/MainProcess] result handler got sentinel
[DEBUG/PoolWorker-3] worker got sentinel -- exiting
[DEBUG/MainProcess] removing tasks from inqueue until task handler finished
[DEBUG/MainProcess] ensuring that outqueue is not full
[DEBUG/MainProcess] task handler exiting
[DEBUG/PoolWorker-3] worker exiting after 2 tasks
[INFO/PoolWorker-3] process shutting down
[DEBUG/MainProcess] result handler exiting: len(cache)=0, thread._state=0
[DEBUG/PoolWorker-3] running all "atexit" finalizers with priority >= 0
[DEBUG/MainProcess] joining worker handler
[DEBUG/MainProcess] terminating workers
[DEBUG/PoolWorker-3] running the remaining "atexit" finalizers
[DEBUG/MainProcess] joining task handler
[DEBUG/MainProcess] joining result handler
[DEBUG/MainProcess] joining pool workers
[DEBUG/MainProcess] cleaning up worker 4811
[DEBUG/MainProcess] running the remaining "atexit" finalizers
Note that each line indicates which process emitted the log record. The output therefore, to some extent, serializes the order of events across the concurrent processes.


By judicious placement of logging.info calls, you should be able to narrow down where and why your script is "dying silently" (or at least the dying won't be quite so silent).

results won't be shared between the processes. Also, each process re-imports the module and creates a new map function; you need to set these up inside an if __name__ == '__main__': block.

I see; results doesn't actually need to be shared, as long as all the results get appended in the end. That should work, shouldn't it? Thanks, this is great insight. I've changed my code to operate this way; unfortunately the map function is still dying silently and I don't know exactly why.

Can you post a runnable example that "dies silently"?

Probably not: the function being iterated over is quite long. I just wish I could get an error message or something out of it. On the surface it looks like it simply stops the loop. As you can see, I print the results right after pool.close(); sometimes the results print, sometimes they don't. I'll try to pin down an example. Thanks for your help.

I've added a suggestion for how to add logging statements to your program. Also note that if the computation is CPU-bound, there is no benefit to using more processes than your machine has processors. If you call Pool() with no argument, the multiprocessing module determines the number of available processors and creates a pool of that size, so there is usually no need to pass a number when initializing Pool().
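As the last comment notes, Pool() with no argument sizes itself to the machine; a minimal sketch:

```python
import multiprocessing as mp

def square(n):
    return n * n

if __name__ == '__main__':
    # Pool() with no argument creates cpu_count() worker processes
    print('processors:', mp.cpu_count())
    with mp.Pool() as pool:
        print(pool.map(square, range(5)))
```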