Load just part of an image in Python

python numpy

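This might be a silly question, but...

I have several thousand images that I would like to load into Python and then convert into numpy arrays. Obviously this goes a little slowly. But I am actually only interested in a small portion of each image (the same portion each time: just the 100x100 pixels at the center of the image).

Is there any way to load just part of the image to make things go faster?

Here is some sample code where I generate some sample images, save them, and load them back in: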

import time

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

# Generate sample images
num_images = 5

for i in range(num_images):
    Z = np.random.rand(2000, 2000)
    print('saving %i' % i)
    plt.imsave('%03i.png' % i, Z)

# Load the images
for i in range(num_images):
    t = time.time()

    im = Image.open('%03i.png' % i)
    w, h = im.size
    # Crop the 100x100 region at the center of the image
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))

    print('Time to open: %.4f seconds' % (time.time() - t))

    # Convert them to numpy arrays
    data = np.array(imc)

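I have run some timing tests, and I am sorry to say that I don't think you can get much faster than the PIL crop command. Even with manual seeking and low-level reading, you still have to read the bytes. Here are the timing results: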

%timeit im.crop((1000-50,1000-50,1000+50,1000+50))
fid = open('003.png','rb')
%timeit fid.seek(1000000)
%timeit fid.read(1)
print('333*100*100/10**(9)*1000=%.2f ms'%(333*100*100/10**(9)*1000))


100000 loops, best of 3: 3.71 us per loop
1000000 loops, best of 3: 562 ns per loop
1000000 loops, best of 3: 330 ns per loop
333*100*100/10**(9)*1000=3.33 ms

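As the calculation at the bottom shows, reading 1 byte at a time × 10,000 bytes (the 100x100 sub-image) × 333 ns per byte = 3.33 ms, which I take to be on par with the crop command above.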
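The estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name `estimated_read_ms` is made up for illustration, and it assumes (as the answer does) one byte per pixel and a fixed cost per single-byte read.

```python
def estimated_read_ms(n_bytes, ns_per_byte):
    """Estimated time in milliseconds to read n_bytes one byte at a time."""
    return n_bytes * ns_per_byte / 1e6  # 1 ms = 1e6 ns


# 100x100 sub-image, roughly 333 ns per one-byte read (figure from the timings above)
print("%.2f ms" % estimated_read_ms(100 * 100, 333))
```

Note that the division here is written with a float divisor so the result does not depend on integer-division behaviour.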

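While you can't get much faster than PIL crop in a single thread, you can use multiple cores to speed everything up! :)

I ran the code below on my 8-core i7 machine as well as on my 7-year-old, dual-core, barely-2 GHz laptop. Both saw significant improvements in run time. As you would expect, the improvement depends on the number of cores available.

The core of the code is the same; I just separated the loop from the actual computation so that the function can be applied to a list of values in parallel.

So this: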

for i in range(num_images):
    t = time.time()

    im = Image.open('%03i.png' % i)
    w, h = im.size
    # Crop the 100x100 region at the center of the image
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))

    print('Time to open: %.4f seconds' % (time.time() - t))

    # Convert them to numpy arrays
    data = np.array(imc)

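Became:

def convert(filename):
    # Open one image, crop the central 100x100 region, return it as an array
    im = Image.open(filename)
    w, h = im.size
    imc = im.crop((w // 2 - 50, h // 2 - 50, w // 2 + 50, h // 2 + 50))
    return np.array(imc)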

The key to the speedup is the Pool feature of the multiprocessing library. It makes it trivial to run things across multiple processors.

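The full code is very basic. It only takes a few extra lines and a little refactoring to move the conversion step into its own function. The results speak for themselves.

Results (8-core i7 and dual-core Intel):

There you go! Even if you have a super old dual-core machine, you can cut the time it takes to open and process your images in half.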
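A minimal sketch of how the Pool-based approach described above might look in full. This is not the answerer's exact code: the `main` function, the `HALF` constant, and the command-line handling are assumptions added for illustration, and it presumes Python 3 with numpy and Pillow installed.

```python
import sys
from multiprocessing import Pool

import numpy as np
from PIL import Image

HALF = 50  # half the side length of the 100x100 crop (illustrative constant)


def convert(filename):
    """Open one image, crop the central 100x100 region, return it as an array."""
    with Image.open(filename) as im:
        w, h = im.size
        box = (w // 2 - HALF, h // 2 - HALF, w // 2 + HALF, h // 2 + HALF)
        return np.array(im.crop(box))


def main(filenames):
    """Convert all files in parallel; returns a list of arrays in input order."""
    if not filenames:
        print("usage: python crop_parallel.py IMAGE [IMAGE ...]")
        return []
    # Pool() starts one worker process per CPU core by default.
    with Pool() as pool:
        return pool.map(convert, filenames)


if __name__ == "__main__":
    crops = main(sys.argv[1:])
    print("loaded %d crops" % len(crops))
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes re-import the main module; without it, each worker would try to start its own pool.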

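A note on memory: if you are processing thousands of images, you will probably hit Python's memory limit at some point. To get around this, just process the data in chunks. You still get all the benefits of multiprocessing, only in smaller bites. Something like: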

for i in range(0, len(images), chunk_size): 
    results = pool.map(convert, images[i : i+chunk_size]) 
    # rest of code.

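Save your files as uncompressed 24-bit BMPs. These store pixel data in a very regular way. Look at the "Image data" portion of the BMP file-layout diagram. Note that most of the complexity in that diagram comes from the headers.

For example, suppose you are storing this 2x2 image (shown magnified in the original post).

This is what the pixel data section looks like if it is stored as a 24-bit uncompressed BMP. Note that, for some reason, the data is stored bottom-up and in BGR order rather than RGB, so the first row in the file is the bottom-most row of the image, the second row in the file is the second row from the bottom, and so on: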

00 00 FF    FF FF FF    00 00
FF 00 00    00 FF 00    00 00

That data is interpreted as follows:

           |  First column  |  Second Column  |  Padding
-----------+----------------+-----------------+-----------
Second Row |  00 00 FF      |  FF FF FF       |  00 00
-----------+----------------+-----------------+-----------
First Row  |  FF 00 00      |  00 FF 00       |  00 00
-----------+----------------+-----------------+-----------

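Or:

           |  First column  |  Second Column  |  Padding
-----------+----------------+-----------------+-----------
Second Row |  red           |  white          |  00 00
-----------+----------------+-----------------+-----------
First Row  |  blue          |  green          |  00 00
-----------+----------------+-----------------+-----------

The padding is there to pad the row size out to a multiple of 4 bytes.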

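So all you have to do is implement a reader for this particular file format, and then calculate the byte offsets at which you have to start and stop reading each row: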

def calc_bytes_per_row(width, bytes_per_pixel):
    res = width * bytes_per_pixel
    if res % 4 != 0:
        res += 4 - res % 4
    return res

def calc_row_offsets(pixel_array_offset, bmp_width, bmp_height, x, y, row_width):
    if x + row_width > bmp_width:
        raise ValueError("This is only for calculating offsets within a row")

    bytes_per_row = calc_bytes_per_row(bmp_width, 3)
    whole_row_offset = pixel_array_offset + bytes_per_row * (bmp_height - y - 1)
    start_row_offset = whole_row_offset + x * 3
    end_row_offset = start_row_offset + row_width * 3
    return (start_row_offset, end_row_offset)

Then you just have to process the proper byte offsets. For example, say you want to read a 400x400 chunk starting at position 500x500 in a 10000x10000 bitmap:

def process_row_bytes(row_bytes):
    ...  # some efficient way to process the bytes

bmpf = open(..., "rb")  # path to the BMP file
pixel_array_offset = ...  # extract from the BMP header
bmp_width = 10000
bmp_height = 10000
start_x = 500
start_y = 500
end_x = 500 + 400
end_y = 500 + 400

# Seek to the start of each row segment and read only the bytes in the region
for cur_y in range(start_y, end_y):
    start, end = calc_row_offsets(pixel_array_offset,
                                  bmp_width, bmp_height,
                                  start_x, cur_y,
                                  end_x - start_x)
    bmpf.seek(start)
    cur_row_bytes = bmpf.read(end - start)
    process_row_bytes(cur_row_bytes)

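Note that how you process the bytes matters. You can probably do something clever with PIL and just dump the pixel data into it, but I am not entirely sure. If you do it in an inefficient way, it might not be worth it. If speed is a huge concern, you might consider writing it in C, or implementing the code above in C and just calling it from Python.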
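As a rough illustration of the approach above, here is a self-contained sketch that fills in the header step using only the standard library. The function names `read_bmp_header` and `read_bmp_region` are hypothetical, and the sketch assumes a bottom-up, uncompressed 24-bit BMP with the common 40-byte BITMAPINFOHEADER; other BMP variants are rejected rather than handled.

```python
import struct


def read_bmp_header(f):
    """Return (pixel_array_offset, width, height) for an uncompressed 24-bit BMP."""
    f.seek(0)
    header = f.read(54)  # 14-byte file header + 40-byte BITMAPINFOHEADER
    if len(header) < 54 or header[:2] != b"BM":
        raise ValueError("not a BMP file")
    pixel_array_offset = struct.unpack_from("<I", header, 10)[0]
    width, height = struct.unpack_from("<ii", header, 18)
    bits_per_pixel = struct.unpack_from("<H", header, 28)[0]
    compression = struct.unpack_from("<I", header, 30)[0]
    if bits_per_pixel != 24 or compression != 0 or height <= 0:
        raise ValueError("only bottom-up, uncompressed 24-bit BMPs are handled")
    return pixel_array_offset, width, height


def read_bmp_region(path, x, y, region_width, region_height):
    """Read a rectangle of pixels.

    Returns a list of bytes objects, top row of the region first,
    each holding region_width pixels in BGR order.
    """
    with open(path, "rb") as f:
        offset, width, height = read_bmp_header(f)
        if x < 0 or y < 0 or x + region_width > width or y + region_height > height:
            raise ValueError("region lies outside the image")
        bytes_per_row = (width * 3 + 3) // 4 * 4  # rows are padded to 4 bytes
        rows = []
        for cur_y in range(y, y + region_height):
            # Rows are stored bottom-up, so image row cur_y is file row height-cur_y-1
            start = offset + bytes_per_row * (height - cur_y - 1) + x * 3
            f.seek(start)
            rows.append(f.read(region_width * 3))
        return rows
```

For example, `read_bmp_region("big.bmp", 500, 500, 400, 400)` would read only the 400 row segments of the region (the filename is a placeholder). The returned bytes are still in BGR order, so the channels need reversing before treating them as RGB.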

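Oh, I just realized there is a far simpler way than what I wrote above about BMP files. If you are generating the image files anyway, and you always know which portion you want to read, simply save that portion as a separate image file while you are generating them: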

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

#Generate sample images
num_images = 5

for i in range(0,num_images):
    Z = np.random.rand(2000, 2000)
    plt.imsave('%03i.png'%i, Z)
    snipZ = Z[200:300, 200:300]
    plt.imsave('%03i.snip.png'%i, snipZ)

#load the images
for i in range(0,num_images):
    im = Image.open('%03i.snip.png'%i)

    #convert them to numpy arrays
    data = np.array(im)

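Comments:

I'm pretty sure you can't, but I would love to be proved wrong on this.

You would have to open the file as raw binary and then use file.seek() etc. to access the part you want.

@avrono Yes, but the problem is how to tell which bits make up the center of the image (regardless of the image dimensions).

Finding specific bytes is more complicated for at least one image type, since it looks like he is using zlib-compressed PNG.

Is PNG a bitmap? I don't think so. It is compressed, so you have to do something first before you can get at the bits. Could your images be bitmaps instead?

OK, it is good to have some independent confirmation that I am already at a local speed maximum. If no other solutions come up, I will accept this answer in a few days. Thanks.

-1, this is not a good comparison. At the point of im.crop, the image has already been loaded, so im.crop is effectively a no-op. A fair comparison would be loading the whole image, then cropping, then converting to an array, versus reading just the relevant bytes and converting them to an array.

Oh, that is really interesting. I had assumed I would be limited by the disk read rate, but this seems to suggest that I am actually limited by the crop function? This is a great practical way to get a speedup. Thanks.

@DanHickstein Measure everything :) I have a script at work very similar to yours (i.e. open/crop/process). I was sure the bottleneck was loading the images from disk (since there are thousands of them). However, after a quick run with kernprof, I realized that reading images from disk can be ridiculously fast (at least on my setup). That said, if you find that read IO is still a problem, you can use multiprocessing.dummy.Pool to easily split the IO across multiple threads.

Ah yes, that is a good option.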
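To illustrate the thread-based suggestion in the comments, here is a minimal sketch using `multiprocessing.dummy.Pool`, which exposes the same interface as `multiprocessing.Pool` but runs the work in threads. The helper names `read_file` and `read_all` and the default of 4 threads are assumptions for illustration; whether threads actually help depends on where the time is going, so it is worth profiling first.

```python
from multiprocessing.dummy import Pool as ThreadPool


def read_file(path):
    """Read one file's raw bytes from disk."""
    with open(path, "rb") as f:
        return f.read()


def read_all(paths, n_threads=4):
    """Read many files concurrently; results come back in the order of paths."""
    with ThreadPool(n_threads) as pool:
        return pool.map(read_file, paths)
```

Because the interface matches, a function such as `convert` from the answer above could be passed to `pool.map` here in place of `read_file`.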
