Python: Cleaning up long-running subprocesses started from a Flask MethodView API
I am building a Flask-driven API. For a particular endpoint, I use the request data to launch a potentially long-running command. Instead of waiting for the command to complete, I wrap it in a multiprocessing.Process, call start, and then return an HTTP 202 to the user along with a URL they can use to monitor the status of the process.
from multiprocessing import Process

from flask import abort, jsonify, request
from flask.views import MethodView


class EndPointAPI(MethodView):

    def __init__(self):
        """On init, filter requests missing JSON body."""
        # Check for json payload ("except" is a reserved word, so the
        # attribute holding the exempt methods is named differently here)
        self.exempt_methods = ["GET", "PUT", "DELETE"]
        if (request.method not in self.exempt_methods) and not request.json:
            abort(400)

    def _long_running_function(self, json_data):
        """
        In this function, I use the input JSON data
        to write a script to the file system, then
        use subprocess.run to execute it.
        """
        return

    def post(self):
        """ """
        # Get input data
        json_data = request.json

        # Kick off the long running function
        p = Process(target=self._long_running_function, args=(json_data,))
        p.start()

        response = {
            "result": "job accepted",
            "links": {
                "href": "/monitor_job/",
            },
        }

        return jsonify(response), 202
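It looks like the processes started in the post method become zombies once they finish, but I can't figure out how to properly track and clean them up without blocking execution of the parent method. I tried implementing a monitoring thread, as has been suggested. As I understand it, the idea is to run a separate thread that watches a FIFO queue, and to put the process handle on the queue before returning from the parent function. I tried an implementation (shown below), but it seems the Process object can't be passed to the thread because it contains a protected AuthenticationString attribute.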
Traceback (most recent call last):
  File "/opt/miniconda3/envs/m137p3/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/opt/miniconda3/envs/m137p3/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
  File "/opt/miniconda3/envs/m137p3/lib/python3.6/multiprocessing/process.py", line 291, in __reduce__
    'Pickling an AuthenticationString object is '
TypeError: Pickling an AuthenticationString object is disallowed for security reasons
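Here is my Python implementation for joining the processes without blocking the parent. I don't know whether it works, because the error above takes the whole system down right from the start. Any thoughts or suggestions on how to launch these processes responsibly without blocking the calling method would be much appreciated.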
from threading import Thread
from multiprocessing import Queue, ...


class Joiner(Thread):

    def __init__(self, q):
        super().__init__()
        self.__q = q

    def run(self):
        while True:
            child = self.__q.get()
            if child is None:
                # A None sentinel tells the thread to stop
                return
            child.join()


class EndPointAPI(MethodView):

    def __init__(self):
        """On init, filter requests missing JSON body."""
        # multiprocessing.Queue pickles every item that is put on it
        self._jobs = Queue()
        self._babysitter = Joiner(self._jobs)
        self._babysitter.start()

        # Check for json payload ("except" is a reserved word, so the
        # attribute holding the exempt methods is named differently here)
        self.exempt_methods = ["GET", "PUT", "DELETE"]
        if (request.method not in self.exempt_methods) and not request.json:
            abort(400)

    def _long_running_function(self, json_data):
        """
        In this function, I use the input JSON data
        to write a script to the file system, then
        use subprocess.run to execute it.
        """
        return

    def post(self):
        """ """
        # Get input data
        json_data = request.json

        # Kick off the long running function
        p = Process(target=self._long_running_function, args=(json_data,))
        p.start()

        # Hand the process to the joiner thread; this put is what
        # triggers the pickling error shown in the traceback
        self._jobs.put(p)

        response = {
            "result": "job accepted",
            "links": {
                "href": "/monitor_job/",
            },
        }

        return jsonify(response), 202
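You were very close :) Everything looks fine except for one thing: you are using a multiprocessing.Queue to store the running processes so that the Joiner instance can join them later. The multiprocessing documentation says the following: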
Note: When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe.
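In other words, the Process object is serialized when it is put on the queue, which causes the following error:
TypeError: Pickling an AuthenticationString object is disallowed for security reasons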
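This happens because every process has a unique authentication key. The key is a byte string that can be thought of as a password; its type is multiprocessing.process.AuthenticationString, and it cannot be pickled.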
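The restriction can be reproduced without starting any child process. The following is a minimal sketch, assuming only the standard library; the helper name pickle_error is hypothetical and exists purely for illustration. It pickles the current process's authentication key directly and reports the resulting error message.

```python
import pickle
from multiprocessing import current_process


def pickle_error(obj):
    """Return the TypeError message raised by pickling obj, or None."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Every process object exposes its key through the authkey attribute.
authkey = current_process().authkey

print(type(authkey).__name__)         # AuthenticationString
print(pickle_error(authkey))          # the message seen in the traceback
print(pickle_error(b"plain bytes"))   # None: ordinary bytes pickle fine
```

Because a Process object holds such a key internally, anything that tries to pickle the Process outside of the multiprocessing spawning machinery runs into the same error.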
The solution is simple: just use a queue.Queue instance to store the long-running processes. Here is a working example:
#!/usr/bin/env python3
import os
import time
from queue import Queue
from threading import Thread
from multiprocessing import Process


class Joiner(Thread):

    def __init__(self):
        super().__init__()
        self.workers = Queue()

    def run(self):
        while True:
            worker = self.workers.get()
            if worker is None:
                break
            worker.join()


def do_work(t):
    pid = os.getpid()
    print('Process', pid, 'STARTED')
    time.sleep(t)
    print('Process', pid, 'FINISHED')


if __name__ == '__main__':
    joiner = Joiner()
    joiner.start()

    for t in range(1, 6, 2):
        p = Process(target=do_work, args=(t,))
        p.start()
        joiner.workers.put(p)

    joiner.workers.put(None)
    joiner.join()
Output:
Process 14498 STARTED
Process 14500 STARTED
Process 14499 STARTED
Process 14498 FINISHED
Process 14499 FINISHED
Process 14500 FINISHED
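One point to keep in mind when carrying this over to the Flask code in the question: by default Flask creates a new instance of a class-based view for every request, so a joiner started in the view's __init__ would mean one extra thread per request. The following is a rough sketch, not a definitive implementation, of keeping a single joiner for the whole application instead. The names ProcessReaper, watch, stop and start_job are hypothetical, and Flask itself is left out so that the example stays self-contained.

```python
import queue
import threading
import time
from multiprocessing import Process


class ProcessReaper:
    """Joins finished workers on one background thread (hypothetical helper)."""

    def __init__(self):
        self._workers = queue.Queue()   # thread-safe and never pickles items
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            worker = self._workers.get()
            if worker is None:          # sentinel: stop reaping
                break
            worker.join()               # blocks only this background thread

    def watch(self, worker):
        """Queue any object with a join() method to be joined later."""
        self._workers.put(worker)

    def stop(self):
        """Join everything queued so far, then stop the thread."""
        self._workers.put(None)
        self._thread.join()


def start_job(reaper, target, *args):
    """Start target(*args) in a child process and hand it to the reaper."""
    p = Process(target=target, args=args)
    p.start()
    reaper.watch(p)
    return p


if __name__ == "__main__":
    reaper = ProcessReaper()            # created once, at application startup
    jobs = [start_job(reaper, time.sleep, 0.2) for _ in range(3)]
    reaper.stop()                       # returns after every job is joined
    print([job.exitcode for job in jobs])
```

In a view, the post method would then call something like start_job with the shared reaper and return the 202 response immediately. If the application runs under several server worker processes, each of them would hold its own reaper, which is fine because a process can only join its own children.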
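Thanks for the answer! Do you understand why the multiprocessing.Queue class behaves differently from the queue.Queue class in this respect? I'm having a hard time understanding the difference between the two and their intended uses.
@jeremiahbuddha That is because processes do not share a memory space, unlike threads, which all run in the same memory space. That is why, to share an object between processes, you first have to serialize it and then transfer it to the other process.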
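The difference described in the last comment can be observed inside a single process. This is a small sketch, assuming only the standard library, with compare_queues as a hypothetical helper name: a queue.Queue hands back the very same object that was put on it, whereas a multiprocessing.Queue hands back an equal copy that was rebuilt from pickled data.

```python
import multiprocessing
import queue


def compare_queues():
    item = {"job": 1}

    # queue.Queue only passes a reference between threads.
    thread_queue = queue.Queue()
    thread_queue.put(item)
    same_in_thread_queue = thread_queue.get() is item

    # multiprocessing.Queue pickles the item and sends it through a pipe.
    process_queue = multiprocessing.Queue()
    process_queue.put(item)
    received = process_queue.get(timeout=5)
    process_queue.close()
    process_queue.join_thread()

    return same_in_thread_queue, received == item, received is item


print(compare_queues())   # (True, True, False)
```

Since a Process object refuses to be pickled, only the reference-passing queue.Queue can carry it to the joiner thread.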