Compare similarity of images using OpenCV with Python

python opencv computer-vision

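I am trying to compare an image against a list of other images and return a selection of images from that list (like Google image search) with up to 70% similarity.

I got this code and adapted it to my context: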

import cv2
import numpy as np

# Load the images
img =cv2.imread(MEDIA_ROOT + "/uploads/imagerecognize/armchair.jpg")

# Convert them to grayscale
imgg =cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

# SURF extraction
surf = cv2.FeatureDetector_create("SURF")
surfDescriptorExtractor = cv2.DescriptorExtractor_create("SURF")
kp = surf.detect(imgg)
kp, descritors = surfDescriptorExtractor.compute(imgg,kp)

# Setting up samples and responses for kNN
samples = np.array(descritors)
responses = np.arange(len(kp),dtype = np.float32)

# kNN training
knn = cv2.KNearest()
knn.train(samples,responses)

modelImages = [MEDIA_ROOT + "/uploads/imagerecognize/1.jpg", MEDIA_ROOT + "/uploads/imagerecognize/2.jpg", MEDIA_ROOT + "/uploads/imagerecognize/3.jpg"]

for modelImage in modelImages:

    # Now loading a template image and searching for similar keypoints
    template = cv2.imread(modelImage)
    templateg= cv2.cvtColor(template,cv2.COLOR_BGR2GRAY)
    keys = surf.detect(templateg)

    keys,desc = surfDescriptorExtractor.compute(templateg, keys)

    for h,des in enumerate(desc):
        des = np.array(des,np.float32).reshape((1,128))

        retval, results, neigh_resp, dists = knn.find_nearest(des,1)
        res,dist =  int(results[0][0]),dists[0][0]


        if dist<0.1: # draw matched keypoints in red color
            color = (0,0,255)

        else:  # draw unmatched in blue color
            #print dist
            color = (255,0,0)

        #Draw matched key points on original image
        x,y = kp[res].pt
        center = (int(x),int(y))
        cv2.circle(img,center,2,color,-1)

        #Draw matched key points on template image
        x,y = keys[h].pt
        center = (int(x),int(y))
        cv2.circle(template,center,2,color,-1)



    cv2.imshow('img',img)
    cv2.imshow('tm',template)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

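I suggest you take a look at the earth mover's distance (EMD) between the images.
This metric gives a feeling for how hard it is to transform one normalized grayscale image into another, but it can be generalized to color images. The following article analyzes the method well: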

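It can be done either on the whole image or on histograms (which is faster than the whole-image approach). I am not sure which method allows a full-image comparison, but for histogram comparison you can use the cv.CalcEMD2 function.

The only problem is that this method does not define a percentage of similarity, but rather a distance that you can filter on.

I know this is not a complete working algorithm, but it is still a basis for one, so I hope it helps.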
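To make the "distance you can filter on" idea concrete, here is a minimal sketch using only NumPy. It assumes 8-bit grayscale images and relies on the fact that, for one-dimensional histograms with unit bin spacing, the EMD equals the summed absolute difference of the cumulative histograms. The function names, the random stand-in images, and the threshold of 20 are all hypothetical and would need tuning on real data.

```python
import numpy as np

def gray_histogram(img, bins=256):
    """Normalized intensity histogram of an 8-bit grayscale image (sums to 1)."""
    hist = np.bincount(np.asarray(img, dtype=np.uint8).ravel(), minlength=bins)
    return hist / hist.sum()

def emd_1d(hist_a, hist_b):
    """EMD between two normalized 1-D histograms with unit spacing between bins."""
    # In one dimension the optimal transport cost is the area between the CDFs.
    return float(np.abs(np.cumsum(hist_a) - np.cumsum(hist_b)).sum())

def filter_similar(query, candidates, max_distance):
    """Return (name, distance) pairs within max_distance, closest first."""
    q = gray_histogram(query)
    scored = [(name, emd_1d(q, gray_histogram(img))) for name, img in candidates.items()]
    return sorted([s for s in scored if s[1] <= max_distance], key=lambda s: s[1])

# Hypothetical data: random arrays stand in for images loaded from disk.
rng = np.random.default_rng(0)
query = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)
candidates = {
    "same": query.copy(),
    "brighter": np.clip(query.astype(int) + 10, 0, 255).astype(np.uint8),
    "dark": (query // 4).astype(np.uint8),
}
matches = filter_similar(query, candidates, max_distance=20.0)
print(matches)
```

A histogram ignores where pixels are located, so two very different pictures with similar brightness distributions will also pass this filter; it is best treated as a cheap first pass.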
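EDIT:

Here is how the EMD works in principle. The main idea is to take two normalized matrices (two grayscale images divided by their sums) and define a flux matrix that describes how gray is moved from each pixel of the first image to other pixels in order to obtain the second image (it can be defined for non-normalized images too, but that is harder).

In mathematical terms the flow matrix is actually a four-dimensional tensor that gives the flow from point (i,j) of the old image to point (k,l) of the new one, but if you flatten the images you can turn it into an ordinary matrix that is just a little harder to read.

This flow matrix has three constraints: each term must be non-negative, the sum of each row must equal the value of the starting pixel, and the sum of each column must equal the value of the destination pixel (with rows indexing the pixels of the first image and columns those of the second).

Given this, you have to minimize the cost of the transformation, namely the sum, over every (i,j) and (k,l), of the flow from (i,j) to (k,l) multiplied by the distance between (i,j) and (k,l).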
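Written as a formula, the problem described above looks roughly as follows. The symbols are assumptions introduced here for illustration: $x_{ij}$ and $y_{kl}$ are the two normalized images, $f_{ij,kl}$ is the flow from pixel $(i,j)$ to pixel $(k,l)$, and $d_{ij,kl}$ is the Euclidean distance between those pixels.

```latex
\begin{aligned}
\min_{f}\quad & \sum_{i,j}\sum_{k,l} f_{ij,kl}\, d_{ij,kl} \\
\text{subject to}\quad & f_{ij,kl} \ge 0, \\
& \sum_{k,l} f_{ij,kl} = x_{ij} \quad \text{for every } (i,j), \\
& \sum_{i,j} f_{ij,kl} = y_{kl} \quad \text{for every } (k,l), \\
\text{with}\quad & d_{ij,kl} = \sqrt{(i-k)^2 + (j-l)^2}.
\end{aligned}
```

Both the objective and the constraints are linear in $f$, so this is a linear program.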
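It looks a bit complicated in words, so here is the test code (the SLSQP listing further down this page). The logic is sound, but I am not sure why the SciPy solver complains about it (you may want to look at openOpt or something similar):

The variable res contains the result of the minimization... but, as I said, I am not sure why it complains about a singular matrix.

The only problem with this algorithm is that it is not very fast, so it cannot be run on demand; you have to run it patiently when building the dataset and store the results somewhere.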
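Since the cost and the constraints are all linear, the same minimization can also be handed to a linear-programming solver. The sketch below is one possible way to do that with scipy.optimize.linprog and is not the code from this answer; the function name emd_linprog is hypothetical, and it assumes both inputs are small 2-D arrays that each sum to one.

```python
import numpy as np
from scipy.optimize import linprog

def emd_linprog(x, y):
    """Earth mover's distance between two normalized 2-D arrays, plus the flow."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, m = x.size, y.size

    # Pixel coordinates in the same row-major order that ravel() uses.
    px = np.indices(x.shape).reshape(x.ndim, -1).T.astype(float)
    py = np.indices(y.shape).reshape(y.ndim, -1).T.astype(float)
    # Cost of moving one unit of mass between every pair of pixels.
    dist = np.linalg.norm(px[:, None, :] - py[None, :, :], axis=2)

    # The flow F is an n x m matrix, flattened row by row into n*m unknowns.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0   # row i of F sums to x[i]
    for j in range(m):
        a_eq[n + j, j::m] = 1.0            # column j of F sums to y[j]
    b_eq = np.concatenate([x.ravel(), y.ravel()])

    res = linprog(dist.ravel(), A_eq=a_eq, b_eq=b_eq, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.fun, res.x.reshape(n, m)

# All the mass sits in one corner and has to travel to the opposite corner.
x = np.array([[1.0, 0.0], [0.0, 0.0]])
y = np.array([[0.0, 0.0], [0.0, 1.0]])
cost, flow = emd_linprog(x, y)
print(cost)  # about 1.414, the length of the diagonal
```

The number of unknowns grows with the product of the two pixel counts, so this only scales to thumbnails or histograms, which matches the remark above about precomputing the results.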
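You are embarking on a massive problem, referred to as "content-based image retrieval", or CBIR. It is a massive and active field. There are no finished algorithms or standard approaches yet, although there are a lot of techniques, all with varying levels of success.

Even Google image search does not do this (yet). They do text-based image search, e.g. searching for text in a page that resembles the text you searched for. (I am sure they are working on using CBIR; it is the holy grail for a lot of image processing researchers.)

If you have a tight deadline or need to get this done and working soon... yikes.

There are a ton of papers on the topic here: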

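Generally you will need to do a few things:
Extract features (either at local interest points, or globally, or somehow; SIFT, SURF, histograms, etc.)
Cluster / build a model of image distributions
This may involve a range of further techniques, and so on.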
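About two years ago I wrote a program in Python/Cython to do something very similar. Later I rewrote it to get better performance. The basic idea came from elsewhere, IIRC.

It basically computes a "fingerprint" for each image, and then compares these fingerprints to match similar images.

The fingerprint is generated by resizing the image to 160x160, converting it to grayscale, adding some blur, normalizing it, and then resizing it to 16x16 monochrome. At the end you have 256 bits of output: that is your fingerprint. This is very easy to do with ImageMagick's convert command (the convert invocation listed further down):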
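(The [0] in path[0] is only there to extract the first frame of animated GIFs; if you are not interested in such images, you can just remove it.)

After applying this to 2 images you will have 2 (256-bit) fingerprints, fp1 and fp2.

The similarity score of these two images is then computed by XORing the two values and counting the bits set to 1. To do this bit counting, you can use a bitsoncount() function, as in the score loop further down.

score will be a number between 0 and 256 indicating how similar the images are: it counts the differing bits, so 0 means the fingerprints are identical.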

import numpy as np
from scipy.optimize import minimize

#original data, two 2x2 images, normalized
x = np.random.rand(2,2)
x /= x.sum()
y = np.random.rand(2,2)
y /= y.sum()

#initial guess of the flux matrix
# just the product of the image x as row for the image y as column
#This is a working flux, but is not an optimal one
F = (y.flatten()*x.flatten().reshape((y.size,-1))).flatten()

#distance matrix, based on euclidean distance
#indexing='ij' keeps the pixel order consistent with flatten()
row_x,col_x = np.meshgrid(range(x.shape[0]),range(x.shape[1]),indexing='ij')
row_y,col_y = np.meshgrid(range(y.shape[0]),range(y.shape[1]),indexing='ij')
rows = ((row_x.flatten().reshape((row_x.size,-1)) - row_y.flatten().reshape((-1,row_x.size)))**2)
cols = ((col_x.flatten().reshape((row_x.size,-1)) - col_y.flatten().reshape((-1,row_x.size)))**2)
D = np.sqrt(rows+cols)

D = D.flatten()
x = x.flatten()
y = y.flatten()
#COST=sum(F*D)

#cost function
fun = lambda F: np.sum(F*D)
jac = lambda F: D
#array of constraint
#the constraint of sum one is implicit given the later constraints
#(the row and column constraints are therefore linearly dependent,
# which SLSQP may report as a singular matrix)
cons  = []
#each row and columns should sum to the value of the start and destination array
#i=i binds the current index; a plain closure would make every constraint use the last i
cons += [ {'type': 'eq', 'fun': lambda F, i=i: np.sum(F.reshape((x.size,y.size))[i,:])-x[i]} for i in range(x.size) ]
cons += [ {'type': 'eq', 'fun': lambda F, i=i: np.sum(F.reshape((x.size,y.size))[:,i])-y[i]} for i in range(y.size) ]
#the values of F should be positive: one (min, max) pair per entry of F
bnds = [(0, None)]*F.size

res = minimize(fun=fun, x0=F, method='SLSQP', jac=jac, bounds=bnds, constraints=cons)

convert path[0] -sample 160x160! -modulate 100,0 -blur 3x99 \
    -normalize -equalize -sample 16x16 -threshold 50% -monochrome mono:-
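For readers who would rather stay in Python, the fingerprint step can be approximated with NumPy. This is a rough sketch and not an equivalent of the convert pipeline above: it skips the blur and equalize steps, assumes the image is already a 2-D grayscale array, and the function name fingerprint is hypothetical. It returns 8 integers of 32 bits each, the layout the scoring loop expects.

```python
import numpy as np

def fingerprint(gray, size=16):
    """Reduce a 2-D grayscale array to size*size bits, packed as 32-bit integers."""
    gray = np.asarray(gray, dtype=float)
    h, w = gray.shape
    # Nearest-neighbour sampling down to size x size (similar in spirit to -sample).
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    small = gray[np.ix_(rows, cols)]
    # Stretch the contrast to the 0..1 range (a crude stand-in for -normalize).
    span = small.max() - small.min()
    small = (small - small.min()) / span if span > 0 else np.zeros_like(small)
    # Threshold at 50% to get one bit per pixel.
    bits = (small > 0.5).astype(np.uint8).ravel()
    # Pack the bits into bytes, then group every 4 bytes into one integer.
    packed = np.packbits(bits)
    return [int.from_bytes(packed[i:i + 4].tobytes(), "big")
            for i in range(0, packed.size, 4)]

# Horizontal gradient: dark on the left, bright on the right.
gradient = np.tile(np.arange(64), (64, 1))
fp = fingerprint(gradient)
print(len(fp), hex(fp[0]))
```

Because the preprocessing differs, fingerprints from this sketch are not comparable with ones produced by the convert command; use one method consistently across the whole image set.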

# fp1 and fp2 are stored as lists of 8 (32-bit) integers
score = 0
for n in range(8):
    score += bitsoncount(fp1[n] ^ fp2[n])
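The loop above relies on bitsoncount(), which is not defined in the snippet. A minimal pure-Python stand-in could look like the following. The helper names fingerprint_score and similarity_percent are hypothetical, and mapping the score to a percentage is an assumption added here to connect it with the 70% threshold from the question.

```python
def bitsoncount(value):
    """Count the bits set to 1 in a non-negative integer."""
    return bin(value).count("1")

def fingerprint_score(fp1, fp2):
    """Number of differing bits between two fingerprints (0 means identical)."""
    return sum(bitsoncount(a ^ b) for a, b in zip(fp1, fp2))

def similarity_percent(fp1, fp2, total_bits=256):
    """Share of matching bits, expressed as a percentage."""
    return 100.0 * (1 - fingerprint_score(fp1, fp2) / total_bits)

# Two made-up fingerprints that differ in exactly half of their 256 bits.
fp_a = [0xFFFFFFFF] * 8
fp_b = [0xFFFFFFFF] * 4 + [0] * 4
print(fingerprint_score(fp_a, fp_b), similarity_percent(fp_a, fp_b))
```

With this mapping, keeping only images at or above 70% corresponds to a score of at most 76 differing bits.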

from keras.preprocessing.image import load_img, img_to_array
from scipy.stats import wasserstein_distance
import numpy as np

def get_histogram(img):
  '''
  Get the histogram of an image. For an 8-bit, grayscale image, the
  histogram will be a 256 unit vector in which the nth value indicates
  the percent of the pixels in the image with the given darkness level.
  The histogram's values sum to 1.
  '''
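  # img_to_array returns floats with a trailing channel axis, so cast
  # to integer levels and flatten before counting
  pixels = np.asarray(img).astype(np.uint8).ravel()
  return np.bincount(pixels, minlength=256) / pixels.size

a = img_to_array(load_img('a.jpg', color_mode='grayscale'))
b = img_to_array(load_img('b.jpg', color_mode='grayscale'))
a_hist = get_histogram(a)
b_hist = get_histogram(b)
# the grey levels are the values and the histograms are their weights
levels = np.arange(256)
dist = wasserstein_distance(levels, levels, a_hist, b_hist)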
print(dist)