
The most Pythonic way to batch load and process images


The code below loads jpeg images into a numpy ndarray. It currently works fine, but I feel there must be a more Pythonic way to do this.

import scipy.ndimage as spimg
import numpy as np


# Read images into scipy and flatten to greyscale
# Using generator function instead of list comprehension
# for memory efficiency
human_files_convert = (spimg.imread(path, flatten=True) for path in human_files[:2099])
The generator function above is used so that each image can be processed one at a time; a list comprehension fails here because it would hold all of the images in memory at once. A small illustration of the difference follows.
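To see the difference, here is a tiny self-contained sketch (unrelated to the image data) of how a generator expression avoids materializing everything at once:

import sys

# A list comprehension builds and holds all elements in memory at once...
squares_list = [i * i for i in range(1_000_000)]
# ...while a generator expression produces one element at a time on demand.
squares_gen = (i * i for i in range(1_000_000))

print(sys.getsizeof(squares_list))  # megabytes: one pointer per element
print(sys.getsizeof(squares_gen))   # ~100 bytes, regardless of length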

batch_size = 1000
step = 0
# Start with a zero-length array so that no uninitialized garbage row
# from np.empty ends up in the final result
human_files_ndarray = np.empty((0, 250, 250))

# Empty list to collect the image arrays for the current batch
human_files_list = []
batch = 1
total_processed = 0

# Iterate through the image arrays yielded by the generator
for img in human_files_convert:
    # Append to the current batch
    human_files_list.append(img)
    step += 1
    total_processed += 1
    # Stack and flush the batch when it is full, or on the last image
    if (step % batch_size == 0) or (len(human_files[:2099]) == total_processed):
        new_stack = np.stack(human_files_list)
        print("Batch: ", batch)
        print(new_stack.shape)
        step = 0
        human_files_ndarray = np.concatenate((human_files_ndarray, new_stack))
        print(human_files_ndarray.shape)
        print(total_processed)
        # Reset the list for the next batch
        human_files_list = []
        batch += 1

Any ideas on how to make this code more efficient or more Pythonic?

Following @sascha's suggestion in the comments, I send the generator function's output to a file instead. Doing this dropped the memory usage for the collection from over 4 GB to under 200 MB. As an added benefit, I now have an on-disk copy of the loaded dataset, much like a pickle file.

# Confirm correct import of images
import scipy.ndimage as spimg
import numpy as np
import h5py
import tqdm

np.set_printoptions(threshold=1000)

# Use h5py to store the large uncompressed image arrays on disk
h5file = h5py.File("images.hdf5", "w")
human_dset = h5file.create_dataset("human_images", (len(human_files), 250, 250))

# Read images into scipy and flatten to greyscale,
# using a generator expression instead of a list comprehension
# for memory efficiency
n_images = len(human_files)  # renamed to avoid shadowing the built-in 'slice'
human_files_convert = (spimg.imread(path, flatten=True) for path in human_files[:n_images])

for i, r in enumerate(tqdm.tqdm(human_files_convert, total=n_images)):
    # Rescale [0, 255] --> [0, 1]
    r = r.astype('float32') / 255
    # Write the row straight to the dataset on disk
    human_dset[i] = r
h5file.close()
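A side benefit of the HDF5 copy (a minimal sketch, assuming the images.hdf5 file and human_images dataset written above): the file can be reopened later and sliced lazily, pulling only the rows you need into memory.

import h5py

# Reopen the dataset written above in read-only mode
with h5py.File("images.hdf5", "r") as f:
    # Slicing reads just these 100 images from disk, not the whole dataset
    first_batch = f["human_images"][:100]
    print(first_batch.shape, first_batch.dtype)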

@sascha: What exactly are you doing there? A shape of (x, 250, 250) looks like you simply want imgs = np.stack(human_files_convert). What kind of memory shortage do you have (you will end up with a dense output array anyway)? And if there is one, do you really want to load all of it into memory (as opposed to HDF5 or something similar)?

OP: I'm trying to stack all the image files into a single array, just as you mention, but because of memory constraints I'm batching the conversion. I'll definitely give your idea a try.

OP: That worked like a charm. I realize now I was hitting this with a nuclear sledgehammer. If you post it as an answer, I'll accept it. Thanks.

@sascha: I'm afraid I don't feel like posting it as an answer; glad it works for you now. My assumption is that batching only helps in some limited scenarios. Simplified guess: adding one image at a time should keep memory usage around x + eps, while adding everything at once should peak at up to x * 2. That only seems relevant when you are in exactly that memory-constrained spot, e.g. when the final array takes up roughly 80% of available memory.

OP: @sascha, my final array does take up about 80-90% of my memory, and that is only a fraction of my full dataset. I'll look into the topics you mentioned.
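For reference, a minimal sketch of the two allocation strategies contrasted in the comments; the dimensions and the stand-in generator are illustrative, not from the original post:

import numpy as np

n, h, w = 2099, 250, 250  # illustrative dimensions

# Stand-in for the spimg.imread generator expression used above
image_generator = (np.zeros((h, w), dtype="float32") for _ in range(n))

# Strategy A: collect, then stack. The list of source arrays and the
# stacked result coexist, so peak memory approaches x * 2:
#     imgs = np.stack(list(image_generator))

# Strategy B: preallocate once and fill row by row. Only one extra
# image is alive at a time, so peak memory stays near x + eps.
imgs = np.empty((n, h, w), dtype="float32")
for i, img in enumerate(image_generator):
    imgs[i] = img
print(imgs.shape)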