Python: improving pytesseract's text recognition accuracy on images
Tags: python, opencv, image-processing, ocr, python-tesseract

I am trying to read a captcha using the pytesseract module. Most of the time, but not always, it returns the correct text.

This is the code that reads the image, manipulates it, and extracts the text from it:
import cv2
import numpy as np
import pytesseract

def read_captcha():
    # OpenCV loads the image in BGR; convert it to RGB
    img = cv2.cvtColor(cv2.imread('captcha.png'), cv2.COLOR_BGR2RGB)

    # Select near-white pixels (could also use a threshold)
    lower_white = np.array([200, 200, 200], dtype=np.uint8)
    upper_white = np.array([255, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(img, lower_white, upper_white)
    # "Erase" the small white points in the resulting mask
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3)))
    mask = cv2.bitwise_not(mask)  # invert mask

    # Load background (could be an image too)
    bk = np.full(img.shape, 255, dtype=np.uint8)  # white background

    # Get masked foreground
    fg_masked = cv2.bitwise_and(img, img, mask=mask)

    # Get masked background; the mask must be inverted
    mask = cv2.bitwise_not(mask)
    bk_masked = cv2.bitwise_and(bk, bk, mask=mask)

    # Combine masked foreground and masked background
    # (note: `final` is computed here, but the OCR below runs on the mask)
    final = cv2.bitwise_or(fg_masked, bk_masked)
    mask = cv2.bitwise_not(mask)  # revert mask to original

    # Resize the image to 3x
    img = cv2.resize(mask, (0, 0), fx=3, fy=3)
    cv2.imwrite('ocr.png', img)

    text = pytesseract.image_to_string(cv2.imread('ocr.png'), lang='eng')
    return text
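For the image manipulation, I got help from this post.
This is the original captcha image:
This image is generated after the manipulation:
However, using pytesseract, I get the text: AX#7rL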
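Can anyone guide me on how to improve the success rate to 100%?

Answer:

Since there are tiny holes in your resulting image, morphological transformations, specifically cv2.MORPH_CLOSE, should work here to close the holes and smooth the image.

1. Threshold to obtain a binary image (black and white)
2. Perform morphological closing to close small holes in the foreground
3. Invert the image to get the result

Result:
4X#7rL

Applying cv2.GaussianBlur() before passing the image to Tesseract might help too.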
import cv2
import pytesseract
# Path for Windows
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Read in image as grayscale
image = cv2.imread('1.png',0)
# Threshold to obtain binary image
thresh = cv2.threshold(image, 220, 255, cv2.THRESH_BINARY)[1]
# Create custom kernel
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
# Perform closing (dilation followed by erosion)
close = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# Invert image to use for Tesseract
result = 255 - close
cv2.imshow('thresh', thresh)
cv2.imshow('close', close)
cv2.imshow('result', result)
# Throw image into tesseract
print(pytesseract.image_to_string(result))
cv2.waitKey()
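Comments:

Thanks, it works like a charm. I have one request, though: can you explain the code to me? It would be great if you added comments to the code above. Thanks.

Sure, I added comments. Essentially, we read in the image and perform preprocessing steps until we get a clean image, then pass it to Tesseract. Using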
cv2.imshow()