How to efficiently do many tasks "a little later" in Python? (python, concurrency, gevent, eventlet)

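I have a process that needs to perform a bunch of actions "later" (usually after 10-60 seconds). The problem is that there can be a lot of those "later" actions (thousands), so using a thread per task is not viable. I know tools like gevent and eventlet exist, but one of the problems is that the process uses zeromq for communication, so I would need some integration (eventlet already has it).
What I am wondering is: what are my options? Suggestions are welcome, along the lines of libraries (if you have used any of those mentioned, please share your experience), techniques (using one thread that sleeps for a while and checks a queue), how to make use of zeromq's poll or eventloop to do the job, or something else.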
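Consider using a priority queue with one or more worker threads to service the tasks. The main thread adds work to the queue, with a timestamp of the soonest time it should be serviced. Worker threads pop work off the queue, sleep until the time given by the priority value is reached, do the work, and then pop another item off the queue.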
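A minimal sketch of that idea, assuming a single worker thread and the standard library's queue.PriorityQueue. The names schedule, worker and _sequence are made up for illustration, and the infinite-timestamp sentinel is just one possible way to stop the worker:

```python
import itertools
import queue
import threading
import time

_sequence = itertools.count()  # tie-breaker so equal timestamps never compare callables

def schedule(tasks, delay, func):
    """Queue func to run no sooner than `delay` seconds from now."""
    tasks.put((time.time() + delay, next(_sequence), func))

def worker(tasks, results):
    """Pop the soonest task, sleep until it is due, run it, repeat."""
    while True:
        due, _, func = tasks.get()
        if func is None:  # sentinel queued with an infinite timestamp: shut down
            break
        remaining = due - time.time()
        if remaining > 0:
            time.sleep(remaining)
        results.append(func())

tasks = queue.PriorityQueue()
results = []
schedule(tasks, 0.2, lambda: "second")
schedule(tasks, 0.1, lambda: "first")
tasks.put((float("inf"), next(_sequence), None))

thread = threading.Thread(target=worker, args=(tasks, results))
thread.start()
thread.join()
print(results)  # ['first', 'second']
```

Note that a worker which is already sleeping on a far-off task will not notice a more urgent task added afterwards; this sketch does nothing about that.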

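How about a more fleshed-out answer. mklauber makes a good point. If there is a chance that all of your workers are sleeping when new, more urgent work arrives, then queue.PriorityQueue is not really the solution, although a "priority queue" is still the technique to use, and it is available from the heapq module. Instead, we will use a different synchronization primitive: a condition variable, which in Python is spelled threading.Condition.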

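The approach is fairly simple: peek at the heap, and if the work at the top is due, pop it off and do it. If there is work but it is scheduled in the future, wait on the condition until then; if there is no work at all, sleep until notified.
The producer does its fair share of the work: every time it adds new work, it notifies the condition, so if there are sleeping workers, one of them wakes up and rechecks the queue for newer work.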
import heapq, time, threading

START_TIME = time.time()
SERIALIZE_STDOUT = threading.Lock()
def consumer(message):
    """the actual work function.  nevermind the locks here, this just keeps
       the output nicely formatted.  a real work function probably won't need
       it, or might need quite different synchronization"""
    with SERIALIZE_STDOUT:
        print(time.time() - START_TIME, message)

def produce(work_queue, condition, timeout, message):
    """called to put a single item onto the work queue."""
    prio = time.time() + float(timeout)
    with condition:
        heapq.heappush(work_queue, (prio, message))
        # wake one sleeping worker so it can recheck the head of the heap
        condition.notify()

def worker(work_queue, condition):
    condition.acquire()
    stopped = False
    while not stopped:
        now = time.time()
        if work_queue:
            # peek at the soonest item without removing it
            prio, data = work_queue[0]
            if data == 'stop':
                # leave the marker in the heap so the other workers see it too
                stopped = True
                continue
            if prio < now:
                heapq.heappop(work_queue)
                # release the lock while working so other workers can proceed
                condition.release()
                # do some work!
                consumer(data)
                condition.acquire()
            else:
                # not due yet: sleep until it is, or until new work is announced
                condition.wait(prio - now)
        else:
            # the queue is empty, wait until notified
            condition.wait()
    condition.release()

if __name__ == '__main__':
    # first set up the work queue and worker pool
    work_queue = []
    cond = threading.Condition()
    pool = [threading.Thread(target=worker, args=(work_queue, cond))
            for _ignored in range(4)]
    for thread in pool:
        thread.start()

    # now add some work
    produce(work_queue, cond, 10, 'Grumpy')
    produce(work_queue, cond, 10, 'Sneezy')
    produce(work_queue, cond, 5, 'Happy')
    produce(work_queue, cond, 10, 'Dopey')
    produce(work_queue, cond, 15, 'Bashful')
    time.sleep(5)
    produce(work_queue, cond, 5, 'Sleepy')
    produce(work_queue, cond, 10, 'Doc')

    # and just to make the example a bit more friendly, tell the threads to stop after all
    # the work is done
    produce(work_queue, cond, float('inf'), 'stop')
    for thread in pool:
        thread.join()

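This answer actually has two suggestions: my first one, and another that I discovered after the first.
sched
I suspect you are looking for the sched module.
EDIT: my bare suggestion seemed of little help after I read it again, so I decided to test the sched module to see whether it can work the way I suggested. Here is my test: I would use it with a single thread, more or less like this: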
import sched, threading, time

class SchedulingThread(threading.Thread):

    def __init__(self):
        threading.Thread.__init__(self)
        self.scheduler = sched.scheduler(time.time, time.sleep)
        self.queue = []
        self.queue_lock = threading.Lock()
        # keep one event always registered so scheduler.run() never returns
        self.scheduler.enter(1, 1, self._schedule_in_scheduler, ())

    def run(self):
        self.scheduler.run()

    def schedule(self, function, delay):
        with self.queue_lock:
            self.queue.append((delay, 1, function, ()))

    def _schedule_in_scheduler(self):
        with self.queue_lock:
            for event in self.queue:
                self.scheduler.enter(*event)
                print("Registerd event", event)
            self.queue = []
        # look for newly queued events again in one second
        self.scheduler.enter(1, 1, self._schedule_in_scheduler, ())

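First, I would create a thread class that has its own scheduler and a queue. At least one event is always registered in the scheduler: the one that calls the method which moves events from the queue into the scheduler.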
class SchedulingThread(threading.Thread):
    def __init__(self):
        threading.Thread.__init__(self)
        self.scheduler = sched.scheduler(time.time, time.sleep)
        self.queue = []
        self.queue_lock = threading.Lock()
        self.scheduler.enter(1, 1, self._schedule_in_scheduler, ())

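The method that schedules events from the queue locks the queue, schedules each event, empties the queue and schedules itself again, so that it looks for new events some time in the future. Note that the interval for looking for new events is short (one second); you may change it: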
    def _schedule_in_scheduler(self):
        with self.queue_lock:
            for event in self.queue:
                self.scheduler.enter(*event)
                print("Registerd event", event)
            self.queue = []
        self.scheduler.enter(1, 1, self._schedule_in_scheduler, ())

The class should also have a method for scheduling user events. Naturally, this method should lock the queue while updating it:
    def schedule(self, function, delay):
        with self.queue_lock:
            self.queue.append((delay, 1, function, ()))

Finally, the class should call the scheduler's main method:
    def run(self):
        self.scheduler.run()

Here is an example of use:
def print_time():
    print("scheduled:", time.time())


if __name__ == "__main__":
    st = SchedulingThread()
    st.start()
    st.schedule(print_time, 10)

    while True:
        print("main thread:", time.time())
        time.sleep(5)

    st.join()

Its output on my machine is:
$ python schedthread.py
main thread: 1311089765.77
Registerd event (10, 1, <function print_time at 0x2f4bb0>, ())
main thread: 1311089770.77
main thread: 1311089775.77
scheduled: 1311089776.77
main thread: 1311089780.77
main thread: 1311089785.77

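(Unfortunately, I did not find how to schedule an event to execute only once, so the function event should unschedule itself. I bet it can be solved with some decorator.)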
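For what it is worth, in the standard library's sched module an event registered with enter() is removed from the queue once it fires, which matches the single "scheduled:" line in the output above. The case that needs extra work is a repeating job: its callback has to register itself again, the way _schedule_in_scheduler does. A small sketch of both cases follows; the FakeClock class and the function names are made up for illustration, and the simulated clock is only there so the example finishes instantly:

```python
import sched

class FakeClock:
    """Simulated clock so the example finishes instantly (illustrative only)."""
    def __init__(self):
        self.now = 0.0
    def time(self):
        return self.now
    def sleep(self, seconds):
        self.now += seconds

clock = FakeClock()
scheduler = sched.scheduler(clock.time, clock.sleep)
one_shot_calls = []
repeating_calls = []

def one_shot():
    # registered once, so it fires once
    one_shot_calls.append(clock.time())

def repeating(remaining):
    repeating_calls.append(clock.time())
    if remaining > 1:
        # a repeating job must put itself back into the scheduler
        scheduler.enter(5, 1, repeating, (remaining - 1,))

scheduler.enter(10, 1, one_shot)
scheduler.enter(5, 1, repeating, (3,))
scheduler.run()  # returns once no events are left

print(one_shot_calls)   # [10.0]
print(repeating_calls)  # [5.0, 10.0, 15.0]
```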
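Have you looked at the multiprocessing module? It comes standard with Python. It is similar to the threading module, but runs each task in a process. You can use a Pool() object to set up a worker pool, then use the .map() method to call a function with the arguments of the various queued tasks.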
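A minimal sketch of that suggestion, assuming each queued task is a (delay, message) pair and that the worker simply sleeps for the delay before doing its work. The function delayed_task and the sample jobs are hypothetical:

```python
import multiprocessing
import time

def delayed_task(args):
    """Sleep for the requested delay, then do the (placeholder) work."""
    delay, message = args
    time.sleep(delay)
    return message.upper()

if __name__ == "__main__":
    jobs = [(0.2, "grumpy"), (0.1, "happy"), (0.3, "bashful")]
    with multiprocessing.Pool(processes=3) as pool:
        # map() blocks until every job is done and returns results in input order
        print(pool.map(delayed_task, jobs))  # ['GRUMPY', 'HAPPY', 'BASHFUL']
```

In this sketch every pending delay occupies a worker process while it sleeps, so with thousands of delayed tasks the pool size becomes the limiting factor.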
    signal.alarm(time)
If time is non-zero, this function requests that a 
SIGALRM signal be sent to the process in time seconds. 
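A rough sketch of how signal.alarm could drive one delayed action, assuming a Unix system and that the code runs in the main thread (signal handlers can only be installed there). The handler name on_alarm and the fired list are made up for illustration:

```python
import signal
import time

fired = []

def on_alarm(signum, frame):
    # runs in the main thread when SIGALRM is delivered
    fired.append(time.time())

signal.signal(signal.SIGALRM, on_alarm)
started = time.time()
signal.alarm(1)  # ask for SIGALRM in one second (whole seconds only)

while not fired:
    time.sleep(0.05)  # the main thread is free to do other work here

print("alarm fired after %.1f seconds" % (fired[0] - started))
```

Only one alarm can be pending per process, and a new signal.alarm() call replaces the previous one, so handling many delayed tasks this way would still need a queue of due times and a handler that re-arms the alarm for the next one.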

>>> import time
>>> import glib
>>> 
>>> def workon(thing):
...     print("%s: working on %s" % (time.time(), thing))
...     return True # use True for repetitive and False for one-time tasks
... 
>>> ml = glib.MainLoop()
>>> 
>>> glib.timeout_add(1000, workon, "this")
2
>>> glib.timeout_add(2000, workon, "that")
3
>>> 
>>> ml.run()
1311343177.61: working on this
1311343178.61: working on that
1311343178.61: working on this
1311343179.61: working on this
1311343180.61: working on this
1311343180.61: working on that
1311343181.61: working on this
1311343182.61: working on this
1311343182.61: working on that
1311343183.61: working on this

from celery.task import task

@task
def add(x, y):
    return x + y

>>> result = add.delay(8, 8)
>>> result.wait() # wait for and return the result
16

from celery.decorators import task

@task
def my_task(arg1, arg2):
    pass # Do something

result = my_task.apply_async(
    args=[sth1, sth2], # Arguments that will be passed to `my_task()` function.
    countdown=3, # Time in seconds to wait before queueing the task.
)