Python: How can I detect and recognize the numbers in an answer key, as in the given image?

python image opencv image-processing deep-learning

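I have this image, and I need to detect and store the answers together with their corresponding question numbers. I tried OCR, but it fails to recognize anything correctly. Is there another way?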

import cv2
from imutils import contours
import numpy as np
import pytesseract

config = '-l eng+equ --oem 3 --psm 8'

# Load image, convert to grayscale, and apply an inverted Otsu threshold
image = cv2.imread('answerkey.png')
original = image.copy()
original1= image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
cv2.imshow("thresh",thresh)
cv2.waitKey(0)

# Filter out all numbers and noise to isolate only boxes
cnts = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

(cnts,_) = contours.sort_contours(cnts, method="top-to-bottom")
(cnts,_) = contours.sort_contours(cnts, method="left-to-right")
print(cnts)
completetext=[]
for c in cnts:
    area = cv2.contourArea(c)
    if 500 < area <5000:
        # cv2.imshow("cnt", original)
        # cv2.waitKey(0)
        x,y,w,h = cv2.boundingRect(c)
        crop = original1[y:y+h, x:x+w]
        gray1 = cv2.cvtColor(crop, cv2.COLOR_BGR2GRAY)
        # cv2.imshow("cropp",crop)
        # cv2.waitKey(0)
        # np.ones, not np.zeros: erosion needs a non-empty structuring element
        kernel = np.ones((2,2), np.uint8)
        erode = cv2.erode(gray1, kernel,iterations=2)

        cv2.imshow("cropp erode", erode)
        cv2.waitKey(0)
        text = pytesseract.image_to_string(erode, config=config)
        print(text)

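Try:
Remove all the lines
Vertically connect the components (the numbers)
Find the contours (the text columns) and sort them from left to right
Slice the image according to the column contours
Pass the individual slices to tesseract with a suitable psm value and a digit whitelist
Hope this solves your problem.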
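If you remove the lines, scale the image up 3x, and run with PSM 6, you will get very good recognition. To remove the lines, you can find them by their white pixels and paint over those parts. To make sure you do not overwrite the characters, discard short lines.

Amazing, man, it really works smoothly. I had not resized the image at all, and I think that was the problem. But can you explain why rescaling the image to 3x makes the recognition so much better?

Tesseract is sensitive to character size in pixels. You have a high-quality image, but the letters are only about 11 pixels tall.

Oh yes, I should have read the Tesseract docs thoroughly. Thanks, man.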
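For what it's worth, the line removal and 3x rescaling described in the comments might look something like this in plain NumPy. It is a sketch under assumptions, not the commenter's code: `erase_long_runs`, `upscale`, and the `min_len=40` cutoff are made-up names and values, and the input is assumed to be a white-on-black binary image such as `thresh` from the question.

```python
import numpy as np


def erase_long_runs(binary, min_len=40):
    """Zero out horizontal and vertical runs of white pixels of at least min_len."""
    out = binary.copy()
    mask = binary > 0
    # First pass walks rows, second pass walks columns (via transposed views).
    for lines_mask, target in ((mask, out), (mask.T, out.T)):
        for row, dst in zip(lines_mask, target):
            # Pad with False so every run has both a start and an end edge.
            padded = np.concatenate(([False], row, [False]))
            edges = np.flatnonzero(padded[1:] != padded[:-1])
            for start, stop in zip(edges[::2], edges[1::2]):
                if stop - start >= min_len:  # shorter runs are character strokes
                    dst[start:stop] = 0
    return out


def upscale(img, factor=3):
    """Nearest-neighbour enlargement; cv2.resize with INTER_CUBIC is smoother."""
    return img.repeat(factor, axis=0).repeat(factor, axis=1)


# Hypothetical usage with the question's variables (not executed here):
#   cleaned = erase_long_runs(thresh)
#   big = 255 - upscale(cleaned)   # back to dark text on a light background
#   text = pytesseract.image_to_string(big, config="--psm 6")
```

The cutoff only has to be longer than the tallest or widest character stroke and shorter than the table lines, so anything comfortably between those two should do.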