Python Pytesseract returns nothing for Urdu and Arabic text

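I am using Pytesseract to convert ID card images to text. So far I have split the image into sections (name, address, ID card number) and run each section through: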

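import pytesseract as tess
from PIL import Image

im = Image.open("Image.jpg")
# Crop box is (left, upper, right, lower) in pixels
crop_rectangle = (20, 320, 400, 400)
cropped_im = im.crop(crop_rectangle)
# Run OCR on the cropped region with the Arabic language model
text = tess.image_to_string(cropped_im, lang='ara')
print(text)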
The result is blank.
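The call above passes only lang='ara', although some of the text is Urdu. Tesseract ships a separate Urdu model under the code urd, and languages can be combined as lang='ara+urd'. One thing that may be worth ruling out is whether both model files are actually installed. The sketch below is only an illustration: missing_traineddata is a hypothetical helper, and it assumes that TESSDATA_PREFIX (or the folder you pass in) points directly at the directory holding the .traineddata files.

```python
import os
from pathlib import Path


def missing_traineddata(langs, tessdata_dir=None):
    """Return the language codes with no <code>.traineddata file in tessdata_dir.

    Hypothetical helper: assumes tessdata_dir (or the TESSDATA_PREFIX
    environment variable) is the folder that directly contains the
    .traineddata files. Falls back to the current directory if neither is set.
    """
    if tessdata_dir is None:
        tessdata_dir = os.environ.get("TESSDATA_PREFIX", ".")
    folder = Path(tessdata_dir)
    return [code for code in langs
            if not (folder / f"{code}.traineddata").is_file()]
```

If the returned list is empty for ["ara", "urd"], both models are present in that folder and lang='ara+urd' could be tried on the Urdu fields. A missing model would not by itself explain blank output, so this only narrows things down.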

I have also tried:
text = tess.image_to_pdf_or_hocr(cropped_im, lang='ara', extension='hocr')

This additional step returns:

b'<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"                
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<title></title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<meta name=\'ocr-system\' content=\'tesseract v5.0.0-alpha.20191030\' />  
<meta name=\'ocr-capabilities\' content=\'ocr_page ocr_carea ocr_par ocr_line ocrx_word     
ocrp_wconf\'/>
</head>
<body>  
<div class=\'ocr_page\' id=\'page_1\' title=\'image     
"C:\\Users\\MOHSIN~1.IFT\\AppData\\Local\\Temp\\tess_za20zk94.PNG"; bbox 0 0 380 80; ppageno 0\'>  
<div class=\'ocr_carea\' id=\'block_1_1\' title="bbox 0 0 380 80">
<p class=\'ocr_par\' id=\'par_1_1\' lang=\'ara\' title="bbox 0 0 380 80">    
<span class=\'ocr_line\' id=\'line_1_1\' title="bbox 0 0 380 80; baseline 0 0; x_size 108;     
x_descenders 27; x_ascenders 27">     
<span class=\'ocrx_word\' id=\'word_1_1\' title=\'bbox 0 0 380 80; x_wconf 95\'> </span>    
</span>
</p>   
</div>
</div>
</body>
</html>'
b'

'
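The hOCR output above contains a single ocrx_word span whose box covers the whole 380x80 crop and whose text is only whitespace, which suggests that Tesseract did not pick out any individual words in the region. To inspect this programmatically rather than by reading the raw markup, the recognized words and their x_wconf values can be pulled out with the standard library. This is a minimal sketch: HocrWordParser and hocr_words are hypothetical names, and it assumes the markup follows the structure shown in the output above.

```python
import re
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, confidence) pairs from ocrx_word spans in hOCR markup."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._conf = None      # confidence of the word span being read, if any
        self._chunks = []      # text fragments seen inside that span

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and attrs.get("class") == "ocrx_word":
            match = re.search(r"x_wconf\s+(\d+)", attrs.get("title") or "")
            self._conf = int(match.group(1)) if match else -1
            self._chunks = []

    def handle_data(self, data):
        if self._conf is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._conf is not None:
            self.words.append(("".join(self._chunks).strip(), self._conf))
            self._conf = None


def hocr_words(hocr):
    """Return a list of (text, confidence) tuples; accepts bytes or str."""
    if isinstance(hocr, bytes):
        hocr = hocr.decode("utf-8")
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    return parser.words
```

Calling hocr_words on the bytes returned by image_to_pdf_or_hocr gives one tuple per word box. An entry with empty text means a box was reported but no characters were recognized inside it.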
I need help converting Urdu/Arabic images to text.
Thanks in advance.

Hey @ProgSMI, were you able to solve this? I am facing the same problem.