
Python OpenCV: segmenting connected characters


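I am trying to do character detection on handwritten letters.

Once the letters are isolated, I recognize them with tesseract or an OpenCV SVM, which has worked well so far.

Segmenting the letters went fine until I ran into letters that are connected to each other.

I use the following code to segment the letters: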

# -*- coding: utf-8 -*-
import numpy as np
import cv2
# from matplotlib import pyplot as plt
from os.path import dirname, join, basename
import sys
from glob import glob

trainpic=[]
# join() inserts the path separator that plain string concatenation omits
targetdir = join(dirname(__file__), 'tmporigin')
#print glob(join(dirname(__file__)+'/cat','*.jpg'))
img = {}

debug = True
a_num = 0
for fn in glob(join(targetdir, '*')):
    filename = basename(fn)
    trainpic.append(cv2.imread(fn, 0))
    img_rgb = cv2.imread(fn)
    img = cv2.imread(fn, 0)
    # NOTE: image_close is computed here but never used below; contours are taken from img
    image_close = cv2.morphologyEx(img_rgb, cv2.MORPH_CLOSE, np.ones((1, 7), np.uint8))
    #if debug:
    #    cv2.imshow('morphology', image_close)
    #    key = cv2.waitKey(0)
    _, contours, hierarchy = cv2.findContours(img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    samples = np.empty((0, 100))
    responses = []
    # keys = [i for i in range(48, 58)]
    tmp_list = []
    tmpcount = 0
    for cnt in contours:
        print 'contourarea:%s' % cv2.contourArea(cnt)
        if cv2.contourArea(cnt) > 130:  # 50 300
            [x, y, w, h] = cv2.boundingRect(cnt)
            print 'boundingRect width:%s' % w
            print 'boundingRect height:%s' % h
            if h > 28:
                cv2.rectangle(img_rgb, (x, y), (x+w, y+h), (0, 0, 255), 2)
                roi = img[y:y+h, x:x+w]
                roismall = cv2.resize(roi, (45, 55))
                if debug:
                    cv2.imshow('norm', img_rgb)
                    key = cv2.waitKey(0)
                # tmp_list.append(roi)
                tmpfilename = fn if tmpcount == 0 else fn.rsplit('.', 1)[0] + '_' + str(tmpcount) + '.png'
                cv2.imwrite(tmpfilename, roismall)
                tmpcount += 1
        else:
            print 'contarea less, skip...'

    #    print img[num].shape
    a_num += 1

print '%s images processed' % a_num
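So letters with a gap between them are handled fine, like this one (split into D and B):

However, connected letters like the following cannot be split:

I searched Google extensively for connected letters and found two related links:

I tried many things, such as morphological dilation, erosion, opening, closing, and watershed, but none of them solved my problem.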
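One technique that is not on the list above is a vertical projection profile split: count the ink pixels in each column of a bounding-box ROI and cut at the thinnest column. The sketch below is not code from the question. It assumes the ROI is binary with non-zero ink pixels, that it holds exactly two letters joined by a thin stroke, and that the letters are roughly upright. The function name `split_at_projection_valley` and its `margin` parameter are hypothetical.

```python
import numpy as np


def split_at_projection_valley(roi, margin=0.25):
    """Split a glyph image in two at its thinnest column (hypothetical helper).

    roi    : 2-D array, non-zero pixels are treated as ink.
    margin : fraction of the width ignored on each side, so the cut
             cannot land on the outer edge of a letter.
    Returns (left, right, cut_column).
    """
    roi = np.asarray(roi)
    ink = roi > 0
    profile = ink.sum(axis=0)          # number of ink pixels in each column
    width = profile.shape[0]
    lo = int(width * margin)
    hi = width - lo
    if hi - lo < 1:
        raise ValueError('ROI too narrow to split')
    # First column with the least ink inside the central band.
    cut = lo + int(np.argmin(profile[lo:hi]))
    return roi[:, :cut], roi[:, cut:], cut


if __name__ == '__main__':
    # Synthetic example: two solid blocks joined by a one-pixel bridge.
    demo = np.zeros((10, 16), np.uint8)
    demo[:, 0:6] = 255
    demo[:, 10:16] = 255
    demo[5, 6:10] = 255
    left, right, cut = split_at_projection_valley(demo)
    print('cut column: %d' % cut)
```

In the segmentation loop, such a helper could be applied only to bounding boxes that are unusually wide for a single letter. A straight vertical cut is likely to fail on slanted or heavily overlapping handwriting, so this is a starting point rather than a general solution.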

I am using OpenCV 3.2.0 and Python 2.7.10 on Ubuntu desktop.

Any suggestions are much appreciated.

Thanks,


Wesley

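Have you read this? It might provide some clues (especially 4.1).

@DanMašek Yes, I read the document, but I didn't get much more useful information from it; the text is too general. I am now working through the tesseract source code, hoping to find something.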