Python multiprocessing in objects
I'm writing a program in which a variable number of Agent objects simultaneously run several serial methods and store their return values in a queue attribute. Each Agent has a single Worker (a subclass of Process) as an attribute and feeds it jobs to run serially through a cmd_queue. The Agent gets the results from its Worker through a res_queue. Both queues are currently Manager().Queue() instances, which causes the following error:
TypeError: Pickling an AuthenticationString object is disallowed for security reasons
However, if I use a regular Queue.Queue, the Worker gets a copy of the Agent's cmd_queue and cannot see what the Agent adds to it (it is always empty).
I was able to pickle instance methods using the solution referenced in this question:
My question is: how can I make this code work with multiprocessing, or is there a better, more accepted way to make this pattern work? This is a small part of a larger framework, so I'd like it to be as OO-friendly as possible.
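Edit: This is on Python 2.7.
Are you willing to use a very mild variant of multiprocessing to make this pattern work? If so, you only need to look a little further into the link you mentioned in your question:
Because pathos.multiprocessing has a Pool that can pickle instance methods very cleanly, you can write the code just as you would in serial Python, and it works, even directly from the interpreter: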
>>> from pathos.multiprocessing import ProcessingPool as Pool
>>> from Queue import Queue
>>> from time import sleep
>>>
>>> class Agent:
...     def __init__(self):
...         self.pool = Pool()
...         self.queue = Queue()
...     def produce(self, f, *args, **kwds):
...         self.queue.put(self.pool.apipe(f, *args, **kwds))
...     def do_some_work(self):
...         self.produce(self.foo, waka='waka')
...     def do_some_other_work(self):
...         self.produce(self.bar, humana='humana')
...     def foo(self, **kwds):
...         sleep(5)
...         return 'this is a foo'
...     def bar(self, **kwds):
...         sleep(10)
...         return 'this is a bar'
...     def get_results(self):
...         res = []
...         while not self.queue.empty():
...             res.append(self.queue.get().get())
...         return res
...
>>> agents = [Agent() for i in range(50)]
>>> for agent in agents:
...     agent.do_some_work()
...     agent.do_some_other_work()
...
>>> for agent in agents:
...     print(agent.get_results())
...
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
>>>
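If adding pathos is not an option, the same "queue of pending results" shape can be sketched with the standard library alone. This is only an illustration under assumptions that are not part of the session above: it targets Python 3, it swaps pathos' apipe for multiprocessing.pool.ThreadPool.apply_async (threads rather than processes, so nothing has to be pickled, but CPU-bound work will not run in parallel), and the pending attribute and close method are hypothetical names.

```python
from multiprocessing.pool import ThreadPool
from queue import Queue


class Agent(object):
    def __init__(self):
        # Assumption: a small thread pool stands in for the pathos Pool.
        self.pool = ThreadPool(2)
        # Holds AsyncResult handles in submission order.
        self.pending = Queue()

    def produce(self, f, *args, **kwds):
        # apply_async returns immediately with an AsyncResult handle.
        self.pending.put(self.pool.apply_async(f, args, kwds))

    def foo(self, **kwds):
        return 'this is a foo'

    def bar(self, **kwds):
        return 'this is a bar'

    def get_results(self):
        res = []
        while not self.pending.empty():
            # .get() on the handle blocks until that job has finished.
            res.append(self.pending.get().get())
        return res

    def close(self):
        self.pool.close()
        self.pool.join()


agent = Agent()
agent.produce(agent.foo, waka='waka')
agent.produce(agent.bar, humana='humana')
results = agent.get_results()
agent.close()
```

Because the handles are queued in the order the jobs were submitted, the results come back in that order even if the jobs finish out of order.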
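Get pathos here:
You can do this with a plain multiprocessing.Queue. You just need to adjust the Agent class so that it does not try to pickle the Queue instances when the Agent instance itself is pickled. This is required because the Agent instance itself has to be pickled whenever an instance method is pickled to be sent to the Worker. It is easy enough to do, though: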
class Agent(object):  # Agent is now a new-style class
    def __init__(self):
        self.cmd_queue = Queue()
        self.res_queue = Queue()
        self.worker = Worker(self.cmd_queue, self.res_queue)
        self.worker.start()

    def __getstate__(self):
        """ This is called to pickle the instance """
        self_dict = self.__dict__.copy()
        del self_dict['cmd_queue']
        del self_dict['res_queue']
        del self_dict['worker']
        return self_dict

    def __setstate__(self, self_dict):
        """ This is called to unpickle the instance. """
        self.__dict__ = self_dict

    ...  # The rest is the same.
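To see the effect of __getstate__ in isolation, here is a minimal sketch. It assumes Python 3, where bound methods pickle natively (so no copy_reg recipe is needed), and it uses a threading.Lock as a stand-in for the unpicklable queue and worker attributes. NaiveAgent, PicklableAgent and can_pickle are hypothetical names used only for illustration.

```python
import pickle
import threading


class NaiveAgent(object):
    """Hypothetical agent holding an attribute that cannot be pickled."""

    def __init__(self):
        self.name = "agent-1"
        self.lock = threading.Lock()  # stand-in for cmd_queue/res_queue/worker

    def foo(self):
        return "this is a foo from " + self.name


class PicklableAgent(NaiveAgent):
    def __getstate__(self):
        # Copy the instance dict and drop what must not cross the process boundary.
        state = self.__dict__.copy()
        del state["lock"]
        return state

    def __setstate__(self, state):
        self.__dict__ = state


def can_pickle(obj):
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


# Pickling a bound method pickles the instance it is bound to.
naive_ok = can_pickle(NaiveAgent().foo)
method_copy = pickle.loads(pickle.dumps(PicklableAgent().foo))
```

The unpickled method is bound to a copy of the agent that has no lock attribute, which mirrors what the Worker receives: it can call foo or bar, but it never sees the queues or the Worker object.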
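Note that there are some other logic problems in this code that keep it from running properly. get_results doesn't really do what you expect, because it is susceptible to a race condition:
    while not self.cmd_queue.empty():  # wait for Worker to finish
        sleep(.5)
    while not self.res_queue.empty():
        res.append(self.res_queue.get())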
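cmd_queue can (and does, in your example code) become empty before the functions you passed to it have actually finished running inside the Worker, which means that some of your results will be missing when you pull everything out of res_queue. You can fix this by using a JoinableQueue, which lets the workers signal when they have finished a task.
You should also send a sentinel to the worker processes so that they shut down properly, and so that all results are flushed from res_queue and sent back to the parent process correctly. I also found that I needed to add a sentinel to res_queue; otherwise res_queue would sometimes show up as empty in the parent before the last result written by the child had actually been flushed through the pipe, which meant the last result was lost.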
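The handshake described above can be sketched in a few lines. This is a simplified illustration, not the full example: it assumes Python 3 and uses a thread with queue.Queue (whose task_done() and join() work like those of JoinableQueue) instead of a separate process, so it runs without any pickling. SENTINEL, worker and run_jobs are hypothetical names.

```python
import queue
import threading

SENTINEL = None  # assumption: None is never a legitimate job or result


def worker(cmd_queue, res_queue):
    # iter(callable, sentinel) keeps calling get() until it returns the sentinel tuple.
    for f, args, kwargs in iter(cmd_queue.get, (SENTINEL, (), {})):
        res_queue.put(f(*args, **kwargs))
        cmd_queue.task_done()      # one task_done() per job taken
    res_queue.put(SENTINEL)        # tell the consumer no more results are coming
    cmd_queue.task_done()          # account for the sentinel itself


def run_jobs(jobs):
    cmd_queue = queue.Queue()
    res_queue = queue.Queue()
    t = threading.Thread(target=worker, args=(cmd_queue, res_queue))
    t.start()
    for f, args, kwargs in jobs:
        cmd_queue.put((f, args, kwargs))
    cmd_queue.put((SENTINEL, (), {}))
    cmd_queue.join()               # blocks until every put has been marked done
    results = list(iter(res_queue.get, SENTINEL))
    t.join()
    return results


results = run_jobs([(pow, (2, 5), {}), (str.upper, ("foo",), {})])
```

Reading res_queue until the sentinel arrives, rather than until empty() returns True, is what closes the race: the consumer never has to guess whether the last result is still in flight. The trade-off is that None can no longer be returned as a real result.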
Here is a complete working example:
from multiprocessing import Process, Queue, JoinableQueue
import types
from time import sleep
import copy_reg

def _pickle_method(method):
    func_name = method.im_func.__name__
    obj = method.im_self
    cls = method.im_class
    return _unpickle_method, (func_name, obj, cls)

def _unpickle_method(func_name, obj, cls):
    for cls in cls.mro():
        try:
            func = cls.__dict__[func_name]
        except KeyError:
            pass
        else:
            break
    return func.__get__(obj, cls)

copy_reg.pickle(types.MethodType, _pickle_method, _unpickle_method)

class Worker(Process):
    def __init__(self, cmd_queue, res_queue):
        self.cmd_queue = cmd_queue
        self.res_queue = res_queue
        Process.__init__(self)

    def run(self):
        for f, args, kwargs in iter(self.cmd_queue.get,
                                    (None, (), {})):  # None is our sentinel
            self.res_queue.put( f(*args, **kwargs) )
            self.cmd_queue.task_done()  # Mark the task as done.
        self.res_queue.put(None)  # Send this to indicate no more results are coming
        self.cmd_queue.task_done()  # Mark the task as done

class Agent(object):
    def __init__(self):
        self.cmd_queue = JoinableQueue()
        self.res_queue = Queue()
        self.worker = Worker(self.cmd_queue, self.res_queue)
        self.worker.start()

    def __getstate__(self):
        self_dict = self.__dict__.copy()
        del self_dict['cmd_queue']
        del self_dict['res_queue']
        del self_dict['worker']
        return self_dict

    def __setstate__(self, self_dict):
        self.__dict__ = self_dict

    def produce(self, f, *args, **kwargs):
        self.cmd_queue.put((f, args, kwargs))

    def do_some_work(self):
        self.produce(self.foo, waka='waka')

    def do_some_other_work(self):
        self.produce(self.bar, humana='humana')

    def send_sentinel(self):
        self.produce(None)

    def foo(self, **kwargs):
        sleep(2)
        return('this is a foo')

    def bar(self, **kwargs):
        sleep(4)
        return('this is a bar')

    def get_results(self):  # blocking call
        res = []
        self.cmd_queue.join()  # This will block until task_done has been called for every put pushed into the queue.
        for out in iter(self.res_queue.get, None):  # None is our sentinel
            res.append(out)
        return res

#This is the interface I'm looking for.
if __name__=='__main__':
    agents = [Agent() for i in range(50)]
    #this should flow quickly as the calls are added to cmd_queues
    for agent in agents:
        agent.do_some_work()
        agent.do_some_other_work()
        agent.send_sentinel()
    for agent in agents:
        print(agent.get_results())
Output:
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
['this is a foo', 'this is a bar']
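This is a good solution too. It does take some of the headstands that are needed to make multiprocessing work... I might suggest, to make life a little simpler, replacing the 16 lines (from import copy_reg to copy_reg.pickle(…)) with a single line: import dill. That one line registers types.MethodType, the same as your 16 lines do. It adds a dependency outside the standard library, but unless you are an absolute purist, that shouldn't matter…
Thank you, this is huge. I had it running with some global dictionaries whose keys were UUIDs for each Agent/Worker pair and whose values were Manager.Queues (plus an Event with the race condition you mentioned), but it was rather flaky. This is the answer to all the related SO questions.
Why do you set self.daemon=True in Worker? Is it a robustness measure to allow it to create processes in the future, or does your solution require it?
@PrckPgn Actually, that shouldn't be there at all. At first I used daemon processes instead of sentinels, but sentinels are cleaner, so I switched to them. I just forgot to remove the self.daemon=True line.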