Python: improving the reliability of reading text

python opencv image-processing


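I'm trying to read relatively clear digits from a screenshot, but I'm having trouble getting pytesseract to read the text correctly. I have the following screenshot:

I know that the score (2-0) and the clock (1:42) will always be in exactly the same place.

This is the code I currently use to read the clock time and the orange score: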

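    import cv2
    import imutils
    import numpy as np
    import pytesseract

    # `input` is the 1080p screenshot, already loaded as a BGR array
    lower_orange = np.array([0, 90, 200], dtype="uint8")
    upper_orange = np.array([70, 160, 255], dtype="uint8")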

    #Isolate scoreboard location on a 1080p pic
    clock = input[70:120, 920:1000]
    scoreboard = input[70:150, 800:1120]

    #greyscale
    roi_gray = cv2.cvtColor(clock, cv2.COLOR_BGR2GRAY)

    config = ("-l eng -c tessedit_char_whitelist=0123456789: --oem 1 --psm 8")
    time = pytesseract.image_to_string(roi_gray, config=config)
    print("time is " + time)

    # find the colors within the specified boundaries and apply
    # the mask
    mask_orange = cv2.inRange(scoreboard, lower_orange, upper_orange)

    # find contours in the thresholded image, then initialize the
    # list of digit locations
    cnts = cv2.findContours(mask_orange.copy(), cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)
    cnts = imutils.grab_contours(cnts)
    locs = []

    for (i, c) in enumerate(cnts):
        # compute the bounding box of the contour, then use the
        # bounding box coordinates to derive the aspect ratio
        (x, y, w, h) = cv2.boundingRect(c)
        ar = w / float(h)

        # since score will be a fixed size of about 25 x 35, we'll set the area at about 300 to be safe
        if w*h > 300:
            # pad the box by 5 px, clamping at 0 so a negative index does not wrap around
            orange_score_img = mask_orange[max(y-5, 0):y+h+5, max(x-5, 0):x+w+5]
            orange_score_img = cv2.GaussianBlur(orange_score_img, (5, 5), 0)

            config = ("-l eng -c tessedit_char_whitelist=012345 --oem 1 --psm 10")
            orange_score = pytesseract.image_to_string(orange_score_img, config=config)
            print("orange_score is " + orange_score)

Here is the output:

time is 1:42
orange_score is
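An empty read like the one above is easy to miss when it is only printed. As a minimal sketch (the helper names `parse_clock` and `parse_score` are hypothetical and not part of the original code), the raw strings could be stripped and checked against the expected shape before they are used. The single-digit 0-5 range is an assumption taken from the whitelist in the code above.

```python
import re

# Hypothetical helpers for sanity-checking OCR output; not from the original post.
CLOCK_RE = re.compile(r"(\d{1,2}):([0-5]\d)")

def parse_clock(raw):
    """Return the clock in seconds, or None if the text is not M:SS or MM:SS."""
    match = CLOCK_RE.fullmatch(raw.strip())
    if match is None:
        return None
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)

def parse_score(raw):
    """Return the score as an int, or None if the text is not one digit in 0-5."""
    text = raw.strip()
    if len(text) == 1 and text in "012345":
        return int(text)
    return None

print(parse_clock("1:42\n"))  # 102
print(parse_score("\n"))      # None, i.e. the failed read is detected
```

A `None` result can then trigger a retry with different preprocessing instead of silently passing an empty string along.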

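This is the orange score image after I masked everything within the lower and upper orange bounds and applied a Gaussian blur:

However, at this point, even when I configure pytesseract to look for a single character and restrict the whitelist, I still can't get it to read correctly. Is there some additional post-processing I'm missing that would help pytesseract read this digit as 2?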

Following @fmw42's suggestion, I tried some morphological changes. Thickening the digit seemed to do the trick:

    kernel = np.ones((5, 5), np.uint8)
    orange_score_img = cv2.dilate(orange_score_img, kernel, iterations=1)

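Edit: I realized that the real answer is that pytesseract works much better with black text on a white background than with white text on a black background! Once I inverted the colors, it read the digit perfectly: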

    orange_score_img = cv2.bitwise_not(orange_score_img)

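I hope this helps anyone who is just starting out with pytesseract! Trying to tune the image to fit all my cases was incredibly frustrating, and knowing that black-on-white text works better would have saved me hours…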

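Perhaps you need to threshold the "2" image to binary first and, if needed, use some morphology open to thicken it.

Thanks @fmw42, I took your thickening suggestion and it was very helpful!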