Python: filling a NumPy 3D array from multiple image files in parallel

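I want to load several grayscale images from files on disk at the same time and put them into one large NumPy array, in order to speed up loading. The basic code looks like this: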

import numpy as np
import matplotlib.pyplot as plt

# prepare filenames
image_files = ...
mask_files = ...
n_samples = len(image_files)  # == len(mask_files)

# preallocate space
all_images = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
all_masks = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)

# read images and masks
for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
    all_images[sample, :, :] = plt.imread(img_path)
    all_masks[sample, :, :] = plt.imread(mask_path)
I would like to run this loop in parallel. However, I know that true multithreading in Python is limited by the GIL.

Do you have any ideas?

You could try using one thread for the images and one for the masks:

import numpy as np
import matplotlib.pyplot as plt
from threading import Thread

# threading functions
def readImg(image_files, mask_files):
    for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
        all_images[sample, :, :] = plt.imread(img_path)

def readMask(image_files, mask_files):
    for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
        all_masks[sample, :, :] = plt.imread(mask_path)


# prepare filenames
image_files = ...
mask_files = ...
n_samples = len(image_files)  # == len(mask_files)

# preallocate space
all_images = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
all_masks = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)

# start one thread per array
image_thread = Thread(target=readImg, args=[image_files, mask_files])
mask_thread = Thread(target=readMask, args=[image_files, mask_files])

image_thread.start()
mask_thread.start()

# wait for both threads to finish before using the arrays
image_thread.join()
mask_thread.join()
Warning: do not copy this code verbatim. I have not tested it; it is only meant to convey the idea.

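This will not use multiple cores, and it will not run sequentially like the code above either. If you need sequential ordering, you would have to use a queue-based implementation. That said, I assume that is not what you are after, since you said you want concurrency and are aware of the interpreter lock on Python threads.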
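The two-thread idea can also be split per file rather than per array. Below is a minimal sketch using a thread pool, not a tested solution for the original data: `load_stack`, `read_fn` and `max_workers` are names made up for illustration, and `read_fn` stands in for whatever reader is used (for example `plt.imread`). Whether this is faster depends on how much of the reader's time is spent in file I/O, which releases the GIL, or in C code that may release it.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def load_stack(paths, shape, read_fn, max_workers=8):
    """Fill a preallocated (n, H, W) float32 array using a pool of threads.

    read_fn is a hypothetical stand-in for the image reader (e.g. plt.imread);
    it must return a 2-D array matching `shape`.
    """
    out = np.empty((len(paths), *shape), dtype=np.float32)

    def load_one(i):
        # Each task writes to its own slice, so no locking is needed.
        out[i] = read_fn(paths[i])

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() consumes the iterator so worker exceptions are raised here.
        list(pool.map(load_one, range(len(paths))))
    return out
```

With the names from the question, this would be called as `all_images = load_stack(image_files, (IMG_HEIGHT, IMG_WIDTH), plt.imread)` and likewise for the masks.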

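Edit: based on your comment, see this post about using multiple cores. To adapt the example above, just use the following line

from multiprocessing import Process as Thread

The two classes share a similar API. Note that, unlike threads, processes do not share memory, so an array filled inside a child process is not visible in the parent unless it lives in shared memory or the results are sent back.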
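Since worker processes cannot write into the parent's arrays directly, one simple option is to let each worker return its decoded image and copy it into the array in the parent. The following is a sketch under that assumption, not the method from the linked post: `load_stack_mp` and `demo_read` are hypothetical names, and `demo_read` only exists so the example runs without image files. Every image is pickled on its way back, so the gain depends on decoding being more expensive than that copy.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np


def demo_read(path):
    """Hypothetical stand-in for plt.imread so the sketch runs without files."""
    return np.full((2, 2), float(len(path)), dtype=np.float32)


def load_stack_mp(paths, shape, read_fn, max_workers=2):
    """Decode images in worker processes and copy them into one array.

    read_fn must be a picklable, module-level function. Each decoded image
    is pickled back to the parent, which costs one extra copy per image.
    """
    out = np.empty((len(paths), *shape), dtype=np.float32)
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so index i matches paths[i].
        for i, img in enumerate(pool.map(read_fn, paths)):
            out[i] = img
    return out
```

On platforms that start processes with spawn, the call has to sit under an `if __name__ == "__main__":` guard, for example `all_images = load_stack_mp(image_files, (IMG_HEIGHT, IMG_WIDTH), plt.imread)`.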

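As you noted, your solution will only run on a single core, so I do not expect any improvement over the single-threaded code. However, if some library could run similar code in native threads (bypassing the GIL), that would be very close to what I want. After reading your link I came across another answer, so I think combining the two answers may help me solve my problem. I will look into it later. Thanks!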