A faster way to analyze every sub-window of an image in Python?

python performance opencv image-processing

I am trying to compute an entropy feature for every sub-window of an image. Here is the code I wrote:

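import cv2
import numpy as np

def genHist(img):
    # Normalized histogram of the gray levels in the sub-window
    # (newer NumPy releases use density=True instead of normed=True)
    hist = np.histogram(img, np.arange(0, 256), normed=True)
    return hist[0]

def calcEntropy(hist):
    # log2(0) is -inf; nan_to_num turns it into a large finite number,
    # and multiplying by the zero bin removes that term from the sum
    logs = np.nan_to_num(np.log2(hist))
    hist_loghist = hist * logs
    entropy = -1 * hist_loghist.sum()
    return entropy

img = cv2.imread("lena.jpg", 0)   # 0 = load as grayscale
result = np.zeros(img.shape, dtype=np.float16)
h, w = img.shape
subwin_size = 5
for y in xrange(subwin_size, h-subwin_size):
    for x in xrange(subwin_size, w-subwin_size):
        subwin = img[y-subwin_size:y+subwin_size, x-subwin_size:x+subwin_size]
        hist = genHist(subwin)         # Generate histogram
        entropy = calcEntropy(hist)    # Calculate entropy
        result[y, x] = entropy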

It does work. The problem is that it is far too slow.

Is there any way to make it faster?

You can make it faster with a few modifications.

Your code took the following time on my laptop:

IPython CPU timings (estimated):
  User   :      50.92 s.
  System :       0.01 s.
Wall time:      51.20 s.

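I made the following modifications:

1 - Removed the function genHist and implemented it inside calcEntropy(). This saves perhaps 1 or 2 seconds.

2 - Instead of logs = np.nan_to_num(np.log2(hist)), I simply add a small value, 0.00001, to hist before taking the log: logs = np.log2(hist + 0.00001). This saves 3-4 seconds, but it slightly changes your output. The maximum error between the two results is 0.0039062. (So whether you want this is up to you.)

3 - Changed np.histogram to cv2.calcHist(). This saves more than 25 seconds.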

With these changes, the code takes the following time on my laptop:

IPython CPU timings (estimated):
  User   :      13.38 s.
  System :       0.00 s.
Wall time:      13.41 s.

That is more than 3 times faster.

Code:

import cv2
import numpy as np

def calcEntropy(img):
    #hist,_ = np.histogram(img, np.arange(0, 256), normed=True)
    hist = cv2.calcHist([img],[0],None,[256],[0,256])
    hist = hist.ravel()/hist.sum()   # normalize counts to probabilities
    #logs = np.nan_to_num(np.log2(hist))
    logs = np.log2(hist+0.00001)     # small offset avoids log2(0)
    #hist_loghist = hist * logs
    entropy = -1 * (hist*logs).sum()
    return entropy

img = cv2.imread("lena.jpg", 0)
result2 = np.zeros(img.shape, dtype=np.float16)
h, w = img.shape
subwin_size = 5
for y in xrange(subwin_size, h-subwin_size):
    for x in xrange(subwin_size, w-subwin_size):
        subwin = img[y-subwin_size:y+subwin_size, x-subwin_size:x+subwin_size]
        #hist = genHist(subwin)         # Generate histogram
        entropy = calcEntropy(subwin)    # Calculate entropy
        result2.itemset(y,x,entropy)

The main problem now is the two for loops. I think implementing them in Cython is the best option here, and it should give very good results.
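If Cython is not an option, the two Python loops can also be avoided in plain NumPy. The following is only a minimal sketch, not the answerer's method: it assumes Python 3, a uint8 grayscale image, and the hypothetical helper names box_sum and window_entropy. For each gray level present in the image it uses an integral image to count how often that level occurs in every k x k window, and accumulates -p*log2(p). It uses one bin per gray level, so values may differ slightly from the original code, whose np.arange(0, 256) edges yield 255 bins.

```python
import numpy as np

def box_sum(a, k):
    """Sum of every k x k window of the 2-D array `a`, via an integral image."""
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    # Inclusion-exclusion on the four corners of each window
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def window_entropy(img, k):
    """Shannon entropy (in bits) of every k x k window of a uint8 image."""
    n = float(k * k)
    out = np.zeros((img.shape[0] - k + 1, img.shape[1] - k + 1))
    for g in np.unique(img):          # at most 256 iterations
        # Fraction of pixels equal to g in each window
        p = box_sum((img == g).astype(np.int64), k) / n
        nz = p > 0                    # skip empty bins: 0 * log2(0) counts as 0
        out[nz] -= p[nz] * np.log2(p[nz])
    return out
```

With k = 2 * subwin_size, entry [i, j] of the returned array describes the window img[i:i+k, j:j+k], which corresponds to result[i + subwin_size, j + subwin_size] in the loop above.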

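As a first step, you should try using math.log instead of the corresponding numpy function, which is much slower when called on one number at a time: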

import time
import math
import numpy as np

x = np.abs(np.random.randn(1000000))

# using numpy
start = time.time()
for i in x:
    np.log2(i)
print "Runtime: %f s" % (time.time()-start)
>>> Runtime: 3.653858 s

# using math.log
start = time.time()
for i in x:
    math.log(i,2)        # use log with base 2
print "Runtime: %f s" % (time.time()-start)
>>> Runtime: 0.692702 s

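The problem is that math.log raises an error for every 0 it encounters. You can work around this by removing all 0 entries from the histogram output. This has several advantages: 1) math.log will not fail, and 2) depending on your image, math.log will be called less often, which leads to faster code. You can drop the zeros because 0*log(0) is taken to be 0, even if log(0) were to return a value. So that product adds nothing to the entropy sum.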

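I ran into the same problem in some audio processing. Unfortunately, I could not get beyond the improvement described above. If you find a better solution, I would be very glad if you posted it here.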
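The zero-removal idea can be sketched as follows. This is an illustration only, assuming Python 3 and NumPy. The function name entropy_nonzero is hypothetical, and it applies np.log2 to the whole filtered array instead of calling math.log per element as the answer does.

```python
import numpy as np

def entropy_nonzero(hist):
    """Shannon entropy (in bits) of a histogram, ignoring empty bins."""
    p = np.asarray(hist, dtype=np.float64)
    p = p[p > 0]          # drop zero bins: 0 * log(0) contributes nothing
    p = p / p.sum()       # normalize in case raw counts were passed in
    return -(p * np.log2(p)).sum()

# Two equally likely gray levels carry exactly one bit
print(entropy_nonzero([0.5, 0.5, 0.0, 0.0]))
```

Because the empty bins are filtered out first, no small offset and no nan_to_num call is needed.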

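numpy.log2 is faster than math.log (about 20 times faster on my PC!) if you use it properly: do not call numpy.log2 for each number inside a loop, but call it once for the whole array (which is how it is done in the question). Just replace the for i in x: np.log2(i) part with np.log2(x) and see for yourself. Compared with the overhead of a Python loop, computing the logarithm of a float takes almost no time on a modern PC.

Thank you very much for your help, Abid! I have been reading your blog ever since I started learning Python + OpenCV. I will also try a Cython implementation.

Have you tried the sklearn function "extract_patches(image, window_size, etc.)"? It computes the sub-windows (subwin) of an input image for a given window size (subwin_size). It should be faster than your implementation (although I have not compared them).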
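The comment about calling np.log2 once for the whole array can be checked with a small script. This is a sketch assuming Python 3. The function names log2_loop and log2_vectorized are made up for illustration, and the measured times depend on the machine, so none are quoted here.

```python
import math
import timeit
import numpy as np

rng = np.random.default_rng(0)
x = np.abs(rng.standard_normal(100_000)) + 1e-12  # strictly positive samples

def log2_loop(values):
    """One math.log call per element, driven by a Python loop."""
    return [math.log(v, 2) for v in values]

def log2_vectorized(values):
    """A single NumPy call over the whole array."""
    return np.log2(values)

if __name__ == "__main__":
    t_loop = timeit.timeit(lambda: log2_loop(x), number=3)
    t_vec = timeit.timeit(lambda: log2_vectorized(x), number=3)
    print("loop: %.4f s, vectorized: %.4f s" % (t_loop, t_vec))
```

Both functions return the same values up to floating-point rounding. Only the number of Python-level calls differs.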