Python: How to print Tesseract results in Chinese characters


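I'm trying to get my program to recognize Chinese with Tesseract, and it works. The only problem is that the result is printed in pinyin (the romanized spelling you use to type Chinese characters on an English keyboard) instead of in Chinese characters.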

# Import libraries
from PIL import Image
import pytesseract
from unidecode import unidecode

pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
image_counter = 2
filelimit = image_counter - 1
outfile = "out_text.txt"
f = open(outfile, "a")
for i in range(1, filelimit + 1):
    print("ran")
    filename = "page_" + str(i) + ".png"
    # Recognize the text in the image as a string using pytesseract
    text = unidecode((pytesseract.image_to_string(Image.open(filename), lang="chi_sim")))
    print(text)

This is the image I ran it on:

This is what I got:

ran
Qing Ming Shi Jie Yu Fen Fen, Lu Shang Xing Ren Yu Duan Que
Xin Wen Jiu Jia He Chu You, Mu Yi Tong Zhi Qiang Hua Cun.

The result should be the Chinese characters shown in the image.

Never mind, I figured out my problem.

text = unidecode((pytesseract.image_to_string(Image.open(filename), lang="chi_sim")))

should be

text = pytesseract.image_to_string(Image.open(filename), lang="chi_tra")
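The change that matters here is most likely dropping unidecode: that function transliterates Unicode text to ASCII, which is why Chinese characters came out as pinyin-like syllables. The language code is a separate choice ("chi_sim" is Tesseract's simplified Chinese model, "chi_tra" the traditional one), and either needs its trained data installed. Below is a minimal sketch of the loop with that fix applied. The names ocr_pages, default_recognize, and the recognize parameter are hypothetical helpers introduced only for illustration, and writing the text to the output file with UTF-8 encoding is an assumption about what the unused file handle in the question was meant for.

```python
def default_recognize(path, lang):
    """Run Tesseract on one image and return the recognized Unicode text.

    The imports are local so that ocr_pages() can be tried with a stand-in
    recognizer on a machine without Tesseract installed.
    """
    import pytesseract
    from PIL import Image

    return pytesseract.image_to_string(Image.open(path), lang=lang)


def ocr_pages(filenames, outfile, lang="chi_tra", recognize=default_recognize):
    """OCR each image, append the text to outfile, and return the texts.

    Hypothetical helper: the key point is that the OCR result is used
    as-is, with no unidecode() call that would turn it into ASCII.
    """
    texts = []
    # Explicit UTF-8 so Chinese characters are written correctly regardless
    # of the platform's default encoding.
    with open(outfile, "a", encoding="utf-8") as f:
        for name in filenames:
            text = recognize(name, lang)
            f.write(text)
            texts.append(text)
    return texts
```

Assuming a file named page_1.png as in the question, a call would look like ocr_pages(["page_1.png"], "out_text.txt"), followed by printing each returned string. If the console cannot display Chinese characters, checking the UTF-8 output file is a more reliable way to confirm the result.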