Python cv2: joining the nearest bounding boxes into rectangles

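I am trying to isolate the words of a medieval manuscript from scanned pages. I use cv2 to detect regions, and it gives me a very satisfying result. I label each rectangle with an incrementing number, and my concern is that the detected regions are not contiguous:

Here is the code I used: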

import numpy as np
import cv2
import matplotlib.pyplot as plt
# This is font for labels
font = cv2.FONT_HERSHEY_SIMPLEX
# I load a picture of a page, gray and blur it
im = cv2.imread('test.png')
imgray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
image_blurred = cv2.GaussianBlur(imgray, (5, 5), 0)
image_blurred = cv2.dilate(image_blurred, None)
# The threshold type is the 4th argument; Otsu then picks the threshold value itself
ret,thresh = cv2.threshold(image_blurred,0,255,cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# I try to retrieve contours and hierarchy on the sample
# findContours returns 3 values in OpenCV 3.x and 2 values in 4.x; [-2:] works with both
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)[-2:]
hierarchy = hierarchy[0]
# I read every contours and retrieve the bounding box 
for i,component in enumerate(zip(contours, hierarchy)):
    cnt = component[0]
    currentHierarchy = component[1]
    precision = 0.01
    epsilon = precision*cv2.arcLength(cnt,True)
    approx = cv2.approxPolyDP(cnt,epsilon,True)
    # This is the best combination I found to isolate parents container
    # It gives me the best result (even if I'm not sure what I'm doing)
    # hierarchy[2/3] is "having child" / "having parent"
    # I thought  currentHierarchy[3] < 0 should be better
    # but it gives no result
    if currentHierarchy[2] > 0 and currentHierarchy[3] > 0:
        x,y,w,h = cv2.boundingRect(approx)
        cv2.rectangle(im,(x,y),(x+w,y+h),(0,255,0),2)
        cv2.putText(im,str(i),(x+2,y+2), font, 1,(0,255,0),2,cv2.LINE_AA)

plt.imshow(im)
plt.show()

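I would like to join the nearest regions together in order to obtain a word tokenization of my page. In my sample picture, I would like to join 2835, 2847, 2864, 2878, 2870 and 2868.

How should I do this? I thought I could store the coordinates of each box in an array and then test (start_x, start_y) and (end_x, end_y), but that seems clumsy to me.

Could you give me a hint?

Thanks,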
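
The idea mentioned in the question, storing each box's coordinates and comparing them, can be sketched roughly as follows. This is only a minimal illustration, not a method taken from the thread: the helper names `boxes_are_close`, `union_box` and `merge_close_boxes` and the `max_gap` parameter are hypothetical, and boxes are assumed to be `(x, y, w, h)` tuples as returned by `cv2.boundingRect`.

```python
def boxes_are_close(a, b, max_gap):
    """Return True if boxes a and b (x, y, w, h) lie within max_gap pixels of each other."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Gap along each axis; zero or negative when the boxes overlap on that axis.
    gap_x = max(ax, bx) - min(ax + aw, bx + bw)
    gap_y = max(ay, by) - min(ay + ah, by + bh)
    return gap_x <= max_gap and gap_y <= max_gap


def union_box(a, b):
    """Return the smallest box (x, y, w, h) that contains both a and b."""
    x1 = min(a[0], b[0])
    y1 = min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x1, y1, x2 - x1, y2 - y1)


def merge_close_boxes(boxes, max_gap=10):
    """Repeatedly merge any two boxes closer than max_gap until none are left to merge."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_are_close(merged[i], merged[j], max_gap):
                    # Replace the pair with its union and rescan from the start,
                    # because the enlarged box may now be close to other boxes.
                    merged[i] = union_box(merged[i], merged[j])
                    del merged[j]
                    changed = True
                    break
            if changed:
                break
    return merged
```

Calling `merge_close_boxes(boxes, max_gap=5)` on the list of rectangles collected in the loop would return one box per group of neighbouring rectangles. The pairwise rescan is quadratic or worse in the number of boxes, which is probably acceptable for a single page; a suitable `max_gap` depends on the scan resolution and has to be tuned by hand.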

I went ahead and tried my own approach to find each word. It is not completely accurate, but see the image below:

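Pseudocode:

1. Apply a Gaussian blur to the grayscale image.

2. Perform Otsu's thresholding.

3. Perform a couple of morphological operations:

3.1 An attempt to remove the thin lines at the top left of the image.

3.2 A dilation to rejoin the individual letters that were separated by the previous operation.

4. Find the contours whose area is above a certain value and mark them.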

EDIT

Code: