Python 3.x TesseractError: 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\eng.traineddata'

I am trying to use pytesseract in a Jupyter notebook.

Windows 10 x64, with the Jupyter notebook (Anaconda3, Python 3.8.3) running with administrator privileges.

The working directory containing the TIFF files is on a different drive (Z:). When I run the following code:

import cv2
import pytesseract

# Point pytesseract at the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'
img = cv2.imread('1.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

hImg, wImg,_ = img.shape
conf = r'--oem 3 --psm 6 outputbase digits'
boxes = pytesseract.image_to_data(img, config=conf)
for a,b in enumerate(boxes.splitlines()):
    print(b)
    if a!=0:  # skip the TSV header row
        b = b.split()
        if len(b)==12:  # only rows that carry recognized text
            x,y,w,h = int(b[6]),int(b[7]),int(b[8]),int(b[9])
            cv2.putText(img,b[11],(x,y-5),cv2.FONT_HERSHEY_SIMPLEX,1,(50,50,255),2)
            cv2.rectangle(img, (x,y), (x+w, y+h), (50, 50, 255), 2)

cv2.imshow('Img',img)
cv2.waitKey(0)
I get the following error:

TesseractError                            Traceback (most recent call last)
<ipython-input-46-ec2bc4b38a7a> in <module>
      1 hImg, wImg,_ = img.shape
      2 conf = r'--oem 3 --psm 6 outputbase digits'
----> 3 boxes = pytesseract.image_to_data(img, config=conf)
      4 # boxes = pytesseract.image_to_data(img, config=tessdata_dir_config)
      5 for a,b in enumerate(boxes.splitlines()):

~\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py in image_to_data(image, lang, config, nice, output_type, timeout, pandas_config)
    460     args = [image, 'tsv', lang, config, nice, timeout]
    461 
--> 462     return {
    463         Output.BYTES: lambda: run_and_get_output(*(args + [True])),
    464         Output.DATAFRAME: lambda: get_pandas_output(

~\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py in <lambda>()
    466         ),
    467         Output.DICT: lambda: file_to_dict(run_and_get_output(*args), '\t', -1),
--> 468         Output.STRING: lambda: run_and_get_output(*args),
    469     }[output_type]()
    470 

~\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py in run_and_get_output(image, extension, lang, config, nice, timeout, return_bytes)
    280         
    281 
--> 282         run_tesseract(**kwargs)
    283         filename = kwargs['output_filename_base'] + extsep + extension
    284         with open(filename, 'rb') as output_file:

~\AppData\Local\Programs\Python\Python38\lib\site-packages\pytesseract\pytesseract.py in run_tesseract(input_filename, output_filename_base, extension, lang, config, nice, timeout)
    256     with timeout_manager(proc, timeout) as error_string:
    257         if proc.returncode:
--> 258             raise TesseractError(proc.returncode, get_errors(error_string))
    259 
    260 

TesseractError: (1, 'Error opening data file \\Program Files (x86)\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')
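The message above says Tesseract could not open eng.traineddata and points at TESSDATA_PREFIX. The path it reports starts with a backslash and has no drive letter. On Windows such a path is resolved against the current drive, which here is presumably Z: rather than C:. A minimal diagnostic sketch follows. The helper name locate_traineddata is hypothetical and not part of pytesseract. It reports what the notebook process sees in TESSDATA_PREFIX and whether the language file can be found there. Depending on the Tesseract version, the variable may be expected to point either at the tessdata directory itself or at its parent, so the sketch checks both layouts.

```python
import os
from pathlib import Path


def locate_traineddata(prefix, lang="eng"):
    """Return the first existing <lang>.traineddata under prefix, or None.

    Hypothetical helper for illustration; it only inspects the file system.
    """
    if not prefix:
        return None
    base = Path(prefix)
    # Layout 1: prefix is the tessdata directory itself.
    # Layout 2: prefix is the parent that contains a tessdata directory.
    candidates = [
        base / f"{lang}.traineddata",
        base / "tessdata" / f"{lang}.traineddata",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    prefix = os.environ.get("TESSDATA_PREFIX")
    print("TESSDATA_PREFIX =", prefix)
    print("eng.traineddata found at:", locate_traineddata(prefix))
```

If the variable is unset, or the file is not found under it, that would be consistent with the error shown in the traceback.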
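I would like to solve this through an environment variable rather than in code, i.e. without setting the config argument passed to pytesseract.image_to_data.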
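One possible approach, sketched under assumptions: set TESSDATA_PREFIX to a full path that includes the drive letter, so that it no longer depends on the current drive. The sketch assumes the tessdata directory sits next to tesseract.exe in the install directory from the question, which should be verified on the actual machine. The helper set_tessdata_prefix is hypothetical. Setting the variable in os.environ only affects the current notebook process and the tesseract process it launches. For a persistent setting, the variable can instead be defined in the Windows environment settings (or with the setx command) and Jupyter restarted so that it picks up the new value.

```python
import os
from pathlib import PureWindowsPath

# Path taken from the question; the tessdata location derived from it
# is an assumption about the install layout.
TESSERACT_CMD = "C:/Program Files (x86)/Tesseract-OCR/tesseract.exe"


def set_tessdata_prefix(tesseract_cmd, env=None):
    """Set TESSDATA_PREFIX to <install dir>\\tessdata and return the value.

    Hypothetical helper for illustration. `env` defaults to os.environ,
    so child processes started afterwards inherit the setting.
    """
    if env is None:
        env = os.environ
    # PureWindowsPath keeps the drive letter and uses backslashes.
    tessdata = PureWindowsPath(tesseract_cmd).parent / "tessdata"
    env["TESSDATA_PREFIX"] = str(tessdata)
    return env["TESSDATA_PREFIX"]


if __name__ == "__main__":
    # Run this before the first pytesseract call in the notebook.
    print(set_tessdata_prefix(TESSERACT_CMD))
```

With the variable set this way, the config string in the question can stay as it is, with no tessdata directory option added to it.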