Python multiprocessing Lock doesn't synchronize multiprocessing Process subclasses

Tags: python, python-multiprocessing

I have about 1000 images that need to be processed individually, with the results saved to disk. The processing itself is CPU-heavy, so I decided to subclass multiprocessing.Process and use a global buffer to collect the results. The buffer has a fixed capacity and is flushed to disk once it fills up.

class ImageBuffer:
    def __init__(self):
        self.buffer = []
        self.size = 0
        self.bulk_index = 0

    def add(self, data):
        if self.size == 2000:
            self.persist()
        self.buffer.append(data)
        self.size += 1
        print(f"Adding new image. Current size {self.size}")

    def persist(self):
        # Flush the buffered results to disk (elided here) and reset the buffer.
        self.buffer = []
        self.size = 0
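As an aside, a plain object like `ImageBuffer` is not shared between processes: when it is passed to a `multiprocessing.Process`, each child receives a pickled copy and mutates only that copy. A minimal, hypothetical sketch (using a stand-in `PlainBuffer` class) illustrating this:

```python
import multiprocessing


class PlainBuffer:
    """A regular object, analogous to ImageBuffer above (hypothetical stand-in)."""
    def __init__(self):
        self.size = 0


def child_add(buf):
    # Runs in the child process, which received a pickled *copy* of buf.
    buf.size += 1


def copied_count():
    buf = PlainBuffer()
    p = multiprocessing.Process(target=child_add, args=(buf,))
    p.start()
    p.join()
    return buf.size  # the parent's object was never touched


if __name__ == "__main__":
    print(copied_count())  # prints 0, not 1
```

The parent's `buf.size` stays 0 because the child incremented its own copy, which is discarded when the child exits.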
The synchronization is supposed to happen inside each process:

from multiprocessing import Process
import multiprocessing


class ImageWorker(Process):
    def __init__(self, buffer: ImageBuffer, lock: multiprocessing.Lock):
        super().__init__()
        self.tasks = []
        self.buffer = buffer
        self.lock = lock

    def add_task(self, task):
        self.tasks.append(task)

    def run(self):
        assert len(self.tasks) != 0
        for _ in range(len(self.tasks)):
            task = self.tasks.pop(0)
            result = process_task(task)

            self.lock.acquire()
            self.buffer.add(result)
            self.lock.release()
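For comparison, the lock-around-shared-state pattern the worker above attempts does work when the state lives in genuinely shared memory. A minimal sketch (not the question's code) using `multiprocessing.Value` together with the same kind of `Lock`:

```python
import multiprocessing


def worker(counter, lock, n):
    for _ in range(n):
        with lock:                 # same Lock primitive the question passes around
            counter.value += 1     # mutates memory that is truly shared


def shared_count(procs=4, per_proc=100):
    counter = multiprocessing.Value("i", 0)  # an int living in shared memory
    lock = multiprocessing.Lock()
    ps = [multiprocessing.Process(target=worker, args=(counter, lock, per_proc))
          for _ in range(procs)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
    return counter.value


if __name__ == "__main__":
    print(shared_count())  # prints 400: every process saw the same counter
```

The explicit lock is still required because `counter.value += 1` is a read-modify-write, which is not atomic on its own.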
This is how I start the processes:

from .image_worker import ImageWorker
from .image_buffer import ImageBuffer
import multiprocessing


class ImageWorkerPool:
    def __init__(self, num_threads=multiprocessing.cpu_count()):
        self.workers = []
        self.work_index = 0
        self.buffer = ImageBuffer()

        lock = multiprocessing.Lock()
        for _ in range(num_threads):
            self.workers.append(ImageWorker(self.buffer, lock))

    def add_task(self, _image_mask):
        self.workers[self.work_index].add_task(_image_mask)
        self.work_index += 1
        self.work_index = self.work_index % len(self.workers)
        assert self.work_index < len(self.workers)

    def start(self):
        for worker in self.workers:
            worker.start()

    def complete(self):
        for worker in self.workers:
            worker.join()
        self.buffer.persist()

This happens because each worker has its own copy of `size`. The workers don't necessarily run in any particular order, which is why you see the counts from several workers interleaved on stdout.

`ImageBuffer` is a regular object. It is not shared between processes; each child works on its own copy. There is no global buffer.

Comment: note that if your tasks are IO-bound, processes won't help. In the worst case they add latency, because inter-process communication is itself a form of IO. If your tasks are not CPU-bound, use threads for IO-bound work.

Reply: @MisterMiyagi, that was my mistake. Also, "IO-bound" was a typo; I meant CPU-bound.

The interleaved output that demonstrates the per-process counters:
Adding new image. Current size 1
Adding new image. Current size 2
Adding new image. Current size 3
Adding new image. Current size 4
Adding new image. Current size 5
Adding new image. Current size 1
Adding new image. Current size 6
Adding new image. Current size 7
Adding new image. Current size 2
Adding new image. Current size 3
Adding new image. Current size 4
Adding new image. Current size 8
Adding new image. Current size 5
Adding new image. Current size 6
Adding new image. Current size 1
Adding new image. Current size 9
Adding new image. Current size 7
Adding new image. Current size 10
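One idiomatic way to get the behavior the question wants is to keep the buffer in the parent process only and let a `multiprocessing.Pool` ship results back. A hedged sketch, with a hypothetical `process_task` standing in for the CPU-heavy image step and an in-memory list standing in for the disk flush:

```python
import multiprocessing


def process_task(task):
    # Hypothetical stand-in for the CPU-heavy image-processing step.
    return task * 2


def run_pipeline(tasks, flush_at=3):
    """Process tasks in a pool of workers; the buffer lives only in the parent."""
    buffer, flushed = [], []
    with multiprocessing.Pool() as pool:
        # Results arrive in the parent as workers finish, in arbitrary order.
        for result in pool.imap_unordered(process_task, tasks):
            buffer.append(result)
            if len(buffer) >= flush_at:
                flushed.append(list(buffer))  # stand-in for persisting to disk
                buffer.clear()
    if buffer:                                # final partial flush
        flushed.append(list(buffer))
    return flushed


if __name__ == "__main__":
    chunks = run_pipeline(range(7))
    print(sum(len(c) for c in chunks))  # 7: every result reached the parent
```

Because only the parent ever touches `buffer`, no lock is needed, and the flush threshold behaves exactly as written.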