Python: filling a 3D NumPy array from multiple image files in parallel
I want to load multiple grayscale images from files on disk concurrently and put them into one large NumPy array, in order to speed up loading. The basic code looks like this:
import numpy as np
import matplotlib.pyplot as plt
# prepare filenames
image_files = ...
mask_files = ...
n_samples = len(image_files) # == len(mask_files)
# preallocate space
all_images = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
all_masks = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
# read images and masks
for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
    all_images[sample, :, :] = plt.imread(img_path)
    all_masks[sample, :, :] = plt.imread(mask_path)
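I would like to run this loop in parallel, but I know that Python's true multithreading is limited by the GIL. Do you have any ideas?
You could try using one thread for the images and one for the masks: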
import numpy as np
import matplotlib.pyplot as plt
from threading import Thread
# threading functions
def readImg(image_files, mask_files):
    for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
        all_images[sample, :, :] = plt.imread(img_path)

def readMask(image_files, mask_files):
    for sample, (img_path, mask_path) in enumerate(zip(image_files, mask_files)):
        all_masks[sample, :, :] = plt.imread(mask_path)
# prepare filenames
image_files = ...
mask_files = ...
n_samples = len(image_files) # == len(mask_files)
# preallocate space
all_images = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
all_masks = np.empty(shape=(n_samples, IMG_HEIGHT, IMG_WIDTH), dtype=np.float32)
# threading stuff
image_thread = Thread(target=readImg,
                      args=[image_files, mask_files])
mask_thread = Thread(target=readMask,
                     args=[image_files, mask_files])
image_thread.daemon = True
mask_thread.daemon = True
image_thread.start()
mask_thread.start()
# wait for both threads to finish before using the arrays
image_thread.join()
mask_thread.join()
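Warning: do not copy this code verbatim. I have not tested it; it is only meant to convey the gist.
This will not use multiple cores, and it will not execute in the same linear order as the code above. If you need that, you would have to use a queue-based implementation. Still, I assume that is not what you are after, since you said you want concurrency and you already know about the interpreter lock on Python threads.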
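The queue-based variant mentioned above could look roughly like the following. This is a minimal sketch, not a tested solution for the original data: `load_stack`, `read_fn`, and `n_workers` are hypothetical names, and `read_fn` stands in for whatever reader is used (for example `plt.imread`). Whether it is actually faster depends on how much of the reader's work happens with the GIL released; blocking file reads release it, decoding may or may not.

```python
import queue
import threading

import numpy as np


def load_stack(paths, height, width, read_fn, n_workers=4):
    """Fill an (n, height, width) float32 array using worker threads.

    read_fn is a hypothetical stand-in for the image reader: it takes a
    path and returns a 2D array of shape (height, width).
    """
    stack = np.empty((len(paths), height, width), dtype=np.float32)

    # Queue every (index, path) pair before any worker starts, so an
    # empty queue reliably means "no work left".
    tasks = queue.Queue()
    for index, path in enumerate(paths):
        tasks.put((index, path))

    errors = []  # collected (path, exception) pairs from the workers

    def worker():
        while True:
            try:
                index, path = tasks.get_nowait()
            except queue.Empty:
                return
            try:
                # Each task writes to its own slice, so no lock is needed.
                stack[index, :, :] = read_fn(path)
            except Exception as exc:
                errors.append((path, exc))

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    if errors:
        raise RuntimeError(f"failed to load {len(errors)} file(s): {errors[0]}")
    return stack
```

With the names from the question, this would be called as `all_images = load_stack(image_files, IMG_HEIGHT, IMG_WIDTH, plt.imread)`, and likewise for the masks. Because each result is written to the slice given by its index, the order of the output matches the order of the file list regardless of which thread finishes first.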
Edit: following your comment, see this post on using multiple cores. To adapt the example above, just use the following line:
from multiprocessing import Process as Thread
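They share a similar API. Note that, unlike threads, separate processes do not share memory with the parent, so arrays filled in a child process have to be sent back to the parent or placed in shared memory.
As you noted, your solution will only run on a single core, so I don't expect any improvement over the single-threaded code. However, a library that could run similar code in native threads (bypassing the GIL) would be very close to what I want. After reading your link I found another relevant answer, so I think combining the two answers may help me solve my problem. I'll look into it later. Thanks!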