Python: How to extract text from a brochure image using pytesseract


I have tried to extract text from a brochure image:

Code:

import cv2
import pytesseract
from PIL import Image

im_folder = 'img_path'
im_gray = cv2.imread(im_folder + '/' + 'big-bazaar-wed-offer-may-21-2014.png', cv2.IMREAD_GRAYSCALE)

# Convert the grayscale image to a binary image.
# With THRESH_OTSU set, the threshold is chosen automatically and the 128 is ignored.
(thresh, im_bw) = cv2.threshold(im_gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Enlarge the binary image to 4x its size
img = cv2.resize(im_bw, None, fx=4, fy=4, interpolation=cv2.INTER_AREA)
cv2.imwrite('im_enhance.png', img)

# Text extraction
text = pytesseract.image_to_string(Image.open('im_enhance.png'))
print(text)

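Since this is a brochure image, I converted it to a binary image and enlarged it to get better OCR results.

This code does extract text, but some of the text is missed, especially the amounts/prices.

What changes should I make to extract all of the text?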