OpenCV - Python Bag of Words (BoW): generating histograms from a dictionary

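I have been trying to build an image classifier in Python with OpenCV 3.2.0, using keypoints and the bag-of-words technique. After some reading, my understanding is that the pipeline looks like this:

1. Extract image descriptors with AKAZE
2. Run k-means clustering on the descriptors to generate a dictionary
3. Generate a histogram for each image based on the dictionary
4. Train an SVM on the histograms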

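I managed to complete steps 1 and 2, but I am stuck on steps 3 and 4.

I successfully generated histograms (I think) from the labels returned by the k-means clustering. However, when I tried new test data that was not used to generate the dictionary, I got some unexpected results. I tried using a FLANN matcher, but the histogram built from the label data does not match the one built from the matches FLANN returns.

I load the images: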

import cv2
import numpy as np
import matplotlib.pyplot as plt

dictionary_size = 512
# Loading images
# imreads is my own helper; it returns a list of all images in that directory
imgs = imreads(imgs_path)
# One zero-filled (dictionary_size, 1) histogram per image
imgs_data = [np.zeros((dictionary_size, 1)) for _ in imgs]

Then I create an array of descriptors (desc):
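The code for this step is not shown. As a minimal sketch of how `desc` and the `desc_src_img` lookup used later could be built, something like the following might work. The names `stack_descriptors` and `extract` are hypothetical; `extract` is assumed to wrap a call such as `cv2.AKAZE_create().detectAndCompute(img, None)[1]`, which can return `None` when an image has no keypoints.

```python
import numpy as np


def stack_descriptors(images, extract):
    """Stack per-image descriptors into one float32 array.

    `extract` is a hypothetical callable returning an (n_i, d) descriptor
    array for an image, or None when no keypoints were found.
    Returns (desc, desc_src_img), where desc_src_img[k] is the index of
    the image that row k of desc came from.
    """
    blocks = []
    src_ids = []
    for img_id, img in enumerate(images):
        d = extract(img)
        if d is None or len(d) == 0:
            continue  # this image contributes no descriptors
        # cv2.kmeans expects float32 samples, so convert here
        blocks.append(np.asarray(d, dtype=np.float32))
        src_ids.append(np.full(len(d), img_id, dtype=np.int64))
    if not blocks:
        raise ValueError("no descriptors were extracted from any image")
    return np.vstack(blocks), np.concatenate(src_ids)
```

Keeping the source image index next to every descriptor row is what makes it possible to turn the flat k-means label array back into per-image histograms.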

Then I cluster the descriptors with k-means:

# Clustering
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 0.01)
flags = cv2.KMEANS_PP_CENTERS
# desc is a float32 numpy array of vstacked descriptors
compactness, labels, dictionary = cv2.kmeans(desc, dictionary_size, None, criteria, 1, flags)

Then I use the labels returned by k-means to build a histogram for each image:

# Getting histograms from labels
# labels has shape (N, 1): one visual-word index per descriptor
for i in range(labels.shape[0]):
    label = labels[i, 0]
    # Get this descriptor's image id
    img_id = desc_src_img[i]
    # imgs_data is a list with one (dictionary_size, 1) zero-initialised array per image
    data = imgs_data[img_id]
    data[label] += 1

ax = plt.subplot(311)
ax.set_title("Histogram from labels")
ax.set_xlabel("Visual words")
ax.set_ylabel("Frequency")
ax.plot(imgs_data[0].ravel())
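The same counting can be done without an explicit Python loop. The following is a sketch only; `histograms_from_labels` is a hypothetical helper name, and it assumes `labels` has the `(N, 1)` shape that `cv2.kmeans` returns and that `desc_src_img` holds one image index per descriptor.

```python
import numpy as np


def histograms_from_labels(labels, desc_src_img, n_images, dictionary_size):
    """Count, for every image, how often each visual word was assigned."""
    labels = np.asarray(labels).ravel()        # flatten the (N, 1) label array
    img_ids = np.asarray(desc_src_img).ravel()
    hists = np.zeros((n_images, dictionary_size), dtype=np.float32)
    # np.add.at accumulates correctly when an (image, word) pair repeats,
    # which plain fancy-index assignment would not
    np.add.at(hists, (img_ids, labels), 1)
    return hists
```

Row `i` of the result plays the role of `imgs_data[i].ravel()` in the loop version.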

This outputs a histogram that is very evenly distributed, as I expected.

Then I tried to do the same thing on the same image, but using FLANN:

matcher = cv2.FlannBasedMatcher_create()
matcher.add(dictionary)
matcher.train()

descriptors = get_descriptors(imgs[0], detector)

result = np.zeros((dictionary_size, 1), np.float32)
# FLANN matcher needs the descriptors to be float32
matches = matcher.match(np.float32(descriptors))
for match in matches:
    visual_word = match.trainIdx
    result[visual_word] += 1

ax = plt.subplot(313)
ax.set_title("Histogram from FLANN")
ax.set_xlabel("Visual words")
ax.set_ylabel("Frequency")
ax.plot(result.ravel())

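This outputs a histogram that is very unevenly distributed and does not match the first one.

You can view the full code and images online. Before running, change "imgs_path" (line 20) to the directory containing your images.

What am I doing wrong? Why are the histograms so different? How do I generate a histogram for new data using the dictionary?

As a side note, I tried using the OpenCV BOW implementation but ran into another error: "_queryDescriptors.type() == trainDescType in function cv::BFMatcher::knnMatchImpl", which is why I tried to implement it myself. If anyone could provide a working example using the Python OpenCV BOW with AKAZE, that would be great.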

It seems you cannot train a FlannBasedMatcher with the dictionary up front, like this:

matcher = cv2.FlannBasedMatcher_create()
matcher.add(dictionary)
matcher.train()

However, you can pass the dictionary in when matching, like this:

matcher = cv2.FlannBasedMatcher_create()

...

matches = matcher.match(np.float32(descriptors), dictionary)

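I'm not entirely sure why this is. Perhaps the train method is only meant to be used by the match method, as implied in this post.

Also, according to the documentation, the parameters of match are:

queryDescriptors – Query set of descriptors.
trainDescriptors – Train set of descriptors. This set is not added to the train descriptors collection stored in the class object.
matches – Matches. If a query descriptor is masked out in mask, no match is added for this descriptor. So, the matches size may be smaller than the query descriptors count.

So I guess you should just pass the dictionary in as trainDescriptors, because that is what it is.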
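One way to sanity-check the matcher output is an exhaustive nearest-neighbour search in plain NumPy. The sketch below is not the FLANN approach itself (FLANN is approximate, so a few assignments may differ), and `bow_histogram` is a hypothetical helper name. It assumes `descriptors` and `dictionary` are 2-D arrays with the same number of columns.

```python
import numpy as np


def bow_histogram(descriptors, dictionary):
    """Histogram of nearest visual words, found by exhaustive L2 search."""
    q = np.asarray(descriptors, dtype=np.float32)
    words = np.asarray(dictionary, dtype=np.float32)
    # Squared Euclidean distances via |q|^2 - 2 q.w + |w|^2, which avoids
    # building an (N, K, d) intermediate array
    d2 = (
        (q ** 2).sum(axis=1)[:, None]
        - 2.0 * q @ words.T
        + (words ** 2).sum(axis=1)[None, :]
    )
    nearest = d2.argmin(axis=1)  # index of the closest word per descriptor
    return np.bincount(nearest, minlength=len(words)).astype(np.float32)
```

Because it only needs the dictionary, the same function applies to descriptors from images that were not part of the clustering step.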

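If anyone could shed more light on this, it would be appreciated.

Here are the results after using the above approach:

You can see the full updated code.

Please don't link a GitHub repository; provide the code in a code block instead.
@lukegv Updated per your request.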
