Python 3.x Pytesseract: OCR on an image with colored text


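I am trying to use pytesseract to read some text from an image. However, the text is orange and the background is black and white. I have tried several options, but so far I have not been able to read the text with pytesseract. Here is a sample of the image: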

Here is the code I have so far:

import pytesseract
from PIL import Image, ImageOps
import numpy as np

# Load the image as 8-bit grayscale, then invert it.
img = Image.open("OCR.png").convert("L")
img = ImageOps.invert(img)
# img.show()
threshold = 240
table = []
pixelArray = img.load()
for y in range(img.size[1]):  # binarize the image row by row
    row = []
    for x in range(img.size[0]):
        # Pixels darker than the threshold become black, the rest white.
        if pixelArray[x, y] < threshold:
            row.append(0)
        else:
            row.append(255)
    table.append(row)

img = Image.fromarray(np.array(table, dtype="uint8"))  # build the image from the array
# img.show()

print(pytesseract.image_to_string(img))

The code above produces an all-black image; the text turns black as well.
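A likely reason, assuming the text is close to pure orange such as RGB (255, 165, 0): `convert("L")` maps that color to a gray level of about 173, inversion turns it into about 82, and 82 is below the threshold of 240, so the text lands on the same side of the threshold as the inverted white background. One possible direction is to select pixels by color instead of by brightness. The sketch below is only an illustration under that assumption; the helper names `isolate_orange` and `ocr_orange_text`, and the hue, saturation, and value bounds, are hypothetical and would need tuning for the real image.

```python
import numpy as np
from PIL import Image


def isolate_orange(img, hue_range=(5, 35), min_sat=100, min_val=100):
    """Return a grayscale image with orange-ish pixels black and the rest white.

    The bounds are illustrative guesses, not values taken from the question.
    """
    # Pillow's HSV mode stores hue, saturation and value each as 0..255.
    hsv = np.asarray(img.convert("RGB").convert("HSV"))
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]

    # Keep pixels whose hue is in the orange band and that are
    # saturated and bright enough to exclude white, gray and black.
    mask = (
        (h >= hue_range[0]) & (h <= hue_range[1])
        & (s >= min_sat) & (v >= min_val)
    )

    # Dark text on a light background is the usual input for OCR.
    out = np.where(mask, 0, 255).astype("uint8")
    return Image.fromarray(out)


def ocr_orange_text(path):
    """Run pytesseract on the color-masked version of the image at `path`."""
    import pytesseract  # imported lazily so the mask helper works without it

    return pytesseract.image_to_string(isolate_orange(Image.open(path)))
```

Calling `ocr_orange_text("OCR.png")` would then run OCR on the masked image. Viewing `isolate_orange(Image.open("OCR.png"))` with `.show()` first is a quick way to check whether the chosen bounds actually capture the text.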