Python 3.x: How to extract specific text from payslip images using pytesseract


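I am new to Tesseract OCR. I have a batch of payslip images and I want to extract the date from each payslip automatically. Please help me with how to do this.

As a first step, I tried to extract the text from a single payslip, but it raises an error: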

import cv2
import pytesseract
from PIL import Image  # needed for Image.fromarray below

img = cv2.imread(r'E:/Receipts/Receipts/0a0ebd53.jpeg')
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
# Note: this only creates a Python variable; it does not set the environment variable
TESSDATA_PREFIX = 'C:/Program Files/Tesseract-OCR/tessdata'
print(pytesseract.image_to_string(img))
# Or convert the array to a PIL image explicitly beforehand
print(pytesseract.image_to_string(Image.fromarray(img)))
Error:

200         }
    201 
--> 202         run_tesseract(**kwargs)
    203         filename = kwargs['output_filename_base'] + os.extsep + extension
    204         with open(filename, 'rb') as output_file:

~\Anaconda3\lib\site-packages\pytesseract\pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice)
    176 
    177     if status_code:
--> 178         raise TesseractError(status_code, get_errors(error_string))
    179 
    180     return True

TesseractError: (1, 'Error opening data file C:\\Program Files (x86)\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

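Please help me fix this error, and also suggest a deep learning model for this task.

Read the image with the PIL library, then pass the image object to image_to_string(image_obj), as shown below: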

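from PIL import Image
import pytesseract

# Adjust to wherever tesseract.exe is installed on your machine
pytesseract.pytesseract.tesseract_cmd = r"C:/Program Files/Tesseract-OCR/tesseract.exe"
# Path to the image to read (the asker's file is used here as an example)
image_path = r"E:/Receipts/Receipts/0a0ebd53.jpeg"
image_obj = Image.open(image_path)
print(pytesseract.image_to_string(image_obj))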
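
Once image_to_string returns the payslip text, the date still has to be picked out of that string. A minimal sketch of one way to do it with regular expressions follows. The function name extract_dates, the DATE_PATTERNS table, and the three date layouts (day-first numeric, ISO, and "5 April 2019" with English month names) are assumptions made for illustration; the thread does not say how the payslips print their dates, so the patterns would need adjusting to the real documents.

```python
import re
from datetime import datetime

# Assumed layouts only; real payslips may print dates differently.
# Numeric dates are treated as day-first (DD/MM/YYYY), which is an assumption.
DATE_PATTERNS = [
    (re.compile(r"\b\d{1,2}[/-]\d{1,2}[/-]\d{4}\b"), ("%d/%m/%Y", "%d-%m-%Y")),
    (re.compile(r"\b\d{4}-\d{2}-\d{2}\b"), ("%Y-%m-%d",)),
    (re.compile(r"\b\d{1,2} [A-Za-z]{3,9} \d{4}\b"), ("%d %b %Y", "%d %B %Y")),
]


def extract_dates(text):
    """Return datetime.date objects for every date-like token that parses.

    Results are grouped by pattern, not ordered by position in the text.
    Tokens that look like dates but are not valid (e.g. 99/99/2019) are skipped.
    """
    found = []
    for pattern, formats in DATE_PATTERNS:
        for match in pattern.finditer(text):
            for fmt in formats:
                try:
                    found.append(datetime.strptime(match.group(), fmt).date())
                    break  # stop at the first format that parses
                except ValueError:
                    continue
    return found
```

With the snippet above, this would be called as extract_dates(pytesseract.image_to_string(image_obj)). OCR output is often noisy (for example "O" read in place of "0"), so some cleanup before matching may be needed on real scans.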

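Is the TESSDATA_PREFIX environment variable set? The error message clearly mentions it.

Yes, Venkata Krishnan, I tried that too. I added the environment variable in my code, but it shows the same error again.

Did you download the language data? Into the specified path?

No, Venkata Krishna.

Download this file and put it in the specified path.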
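
The two points raised in the comments (the language file has to exist, and TESSDATA_PREFIX has to be a real environment variable) can be checked from Python before calling pytesseract. This is a rough sketch, not a definitive fix: the helper names find_tessdata_dir and configure_tessdata are made up for illustration, and the candidate directories are only the install locations quoted in this thread. It assumes a Tesseract version that looks for eng.traineddata directly inside TESSDATA_PREFIX, which is what the path in the error message suggests.

```python
import os
from pathlib import Path


def find_tessdata_dir(candidates, lang="eng"):
    """Return the first directory that contains <lang>.traineddata, or None."""
    for candidate in candidates:
        directory = Path(candidate)
        if (directory / f"{lang}.traineddata").is_file():
            return directory
    return None


def configure_tessdata(candidates, lang="eng"):
    """Point TESSDATA_PREFIX at a directory that actually holds the language file."""
    directory = find_tessdata_dir(candidates, lang)
    if directory is None:
        raise FileNotFoundError(
            f"{lang}.traineddata not found in any of: {[str(c) for c in candidates]}"
        )
    # This must go into the process environment; assigning a plain Python
    # variable named TESSDATA_PREFIX is never seen by the tesseract executable.
    os.environ["TESSDATA_PREFIX"] = str(directory)
    return directory


# Example call (paths are assumptions taken from the thread):
# configure_tessdata([
#     r"C:/Program Files/Tesseract-OCR/tessdata",
#     r"C:/Program Files (x86)/Tesseract-OCR/tessdata",
# ])
```

If configure_tessdata raises FileNotFoundError, the language file is missing and has to be downloaded first, as the last comment says. Calling it before image_to_string matters because pytesseract launches tesseract as a subprocess, which inherits the environment at that moment.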