Python: problem using image_to_string with tesseract

Tags: python, python-3.x, tesseract, archlinux

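Hi all, I have some simple Python code that uses tesseract, but I think I am running into a version-related problem or something similar. Please take a look at the code: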

from PIL import Image
import pytesseract
file = '/home/gxs/Downloads/a.png'
img = Image.open(file)
text = pytesseract.image_to_string(Image.open(file))

For this I get the following output (error):

TesseractError                            Traceback (most recent call last)
<ipython-input-1-65b8cbea5fe0> in <module>
      4 img = Image.open(file)
      5 #display(img)
----> 6 text = pytesseract.image_to_string(Image.open(file))

~/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py in image_to_string(image, lang, config, nice, output_type, timeout)
    368     args = [image, 'txt', lang, config, nice, timeout]
    369 
--> 370     return {
    371         Output.BYTES: lambda: run_and_get_output(*(args + [True])),
    372         Output.DICT: lambda: {'text': run_and_get_output(*args)},

~/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py in <lambda>()
    371         Output.BYTES: lambda: run_and_get_output(*(args + [True])),
    372         Output.DICT: lambda: {'text': run_and_get_output(*args)},
--> 373         Output.STRING: lambda: run_and_get_output(*args),
    374     }[output_type]()
    375 

~/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py in run_and_get_output(image, extension, lang, config, nice, timeout, return_bytes)
    280         }
    281 
--> 282         run_tesseract(**kwargs)
    283         filename = kwargs['output_filename_base'] + extsep + extension
    284         with open(filename, 'rb') as output_file:

~/.local/lib/python3.8/site-packages/pytesseract/pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
    256     with timeout_manager(proc, timeout) as error_string:
    257         if proc.returncode:
--> 258             raise TesseractError(proc.returncode, get_errors(error_string))
    259 
    260 

TesseractError: (-11, 'Tesseract Open Source OCR Engine v3.03 with Leptonica actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53')

Comment: You may be missing the OCR data for the trained models. Have you tried installing them?

Reply: I have installed tesseract-data-eng, and I also tried installing all of them.
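
One way to narrow this down (a sketch, not a confirmed fix): in Python's subprocess module a negative return code means the child process was killed by a signal, so the -11 above suggests the tesseract binary itself crashed rather than reporting an ordinary error. The snippet below uses only the standard library to decode that return code and to print the version banner of whichever tesseract comes first on PATH, so it can be compared with the version the installed language data was built for. The helper names describe_returncode and tesseract_version_banner are made up for illustration, and the idea that a version mismatch between the v3.03 binary and the language data is the cause is an assumption that the post does not confirm.

```python
import shutil
import signal
import subprocess


def describe_returncode(returncode):
    """Explain a subprocess return code; negative values mean 'killed by a signal'."""
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = "unknown signal {}".format(-returncode)
        return "killed by {}".format(name)
    return "exited with status {}".format(returncode)


def tesseract_version_banner(cmd="tesseract"):
    """Return the first line printed by `<cmd> --version`, or None if unavailable."""
    path = shutil.which(cmd)
    if path is None:
        return None  # no such executable on PATH
    # Older releases print the banner on stderr, newer ones on stdout,
    # so both streams are merged before reading.
    proc = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    lines = proc.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    # -11 is the return code reported in the TesseractError above.
    print(describe_returncode(-11))
    print(tesseract_version_banner())
```

On Linux the first line printed is `killed by SIGSEGV`; the second depends on the local install and is `None` when no tesseract executable is found on PATH.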