OCR on binary images (ocr, google-cloud-vision, python-tesseract)


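I have a binary (black-and-white) text image like this one.

I want to run OCR on images like it. Each image contains a single word. I tried both Tesseract and Google Cloud Vision, but neither returned any result. I am using Python 3.6 on Windows 10.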

# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types

# Instantiates a client
client = vision.ImageAnnotatorClient()

with io.open("test.png", 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations
resp = ''

for text in texts:
    resp+=' ' + text.description

print(resp)

from PIL import Image as im
import pytesseract as ts

# canvas contains the Mat object from which the image is saved to png
print(ts.image_to_string(im.fromarray(canvas.reshape((480, 640)), 'L')))

This image should be an easy task for either of them, so I feel I am missing something in my code. Please help.
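On the Tesseract side, two things may be worth checking for an image that holds a single word: the page segmentation mode (`--psm 8` tells Tesseract to treat the image as one word) and the polarity of the image, since Tesseract generally does better with dark text on a light background. The following is only a sketch under those assumptions, not a confirmed fix for this image; the helper names `tesseract_config`, `mostly_dark` and `ocr_single_word`, and the 20-pixel border, are made up for illustration.

```python
def tesseract_config(psm=8):
    """Build a Tesseract config string; psm 8 means 'treat the image as a single word'."""
    return "--psm {}".format(psm)


def mostly_dark(histogram):
    """Return True if a 256-bin grayscale histogram has more dark than light pixels."""
    dark = sum(histogram[:128])
    light = sum(histogram[128:])
    return dark > light


def ocr_single_word(path, border=20):
    """Run Tesseract on a single-word image stored at `path` (hypothetical helper)."""
    # Imported lazily so the pure helpers above work without these packages.
    from PIL import Image, ImageOps
    import pytesseract

    gray = Image.open(path).convert("L")

    # Assumption: light text on a dark background is inverted first,
    # so that Tesseract sees dark text on a light background.
    if mostly_dark(gray.histogram()):
        gray = ImageOps.invert(gray)

    # Add a white margin so the word does not touch the image edge.
    gray = ImageOps.expand(gray, border=border, fill=255)

    return pytesseract.image_to_string(gray, config=tesseract_config()).strip()
```

Calling `ocr_single_word("test.png")` would then return the recognized word, if any. If `--psm 8` gives nothing, `--psm 7` (a single text line) is another mode that could be tried.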

Edit:

Thanks to F10 for pointing me in the right direction. This is how I got it working with a local image:

# export GOOGLE_APPLICATION_CREDENTIALS=kyourcredentials.json
import io
import cv2
from PIL import Image

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types
from google.cloud.vision import enums

# Instantiates a client
client = vision.ImageAnnotatorClient()

with io.open("test.png", 'rb') as image_file:
    content = image_file.read()

features = [
    types.Feature(type=enums.Feature.Type.DOCUMENT_TEXT_DETECTION)
]


image = types.Image(content=content)

request = types.image_annotator_pb2.AnnotateImageRequest(image=image, features=features)
response = client.annotate_image(request)

print(response)
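The snippet above relies on the `types` and `enums` submodules, which were removed in google-cloud-vision 2.0. For newer client versions, the same local-image request could look roughly like the sketch below. It assumes google-cloud-vision 2.0 or later and valid credentials in `GOOGLE_APPLICATION_CREDENTIALS`; the function names `extract_text` and `read_local_image` are hypothetical and only serve the example.

```python
def extract_text(response):
    """Return the recognized text from an AnnotateImageResponse-like object."""
    # The API reports per-image failures in response.error rather than raising.
    if response.error.message:
        raise RuntimeError(response.error.message)
    return response.full_text_annotation.text.strip()


def read_local_image(path):
    """Send a local image to Cloud Vision and return the detected text."""
    # Imported lazily so extract_text can be used without the client library.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()

    with open(path, "rb") as image_file:
        content = image_file.read()

    # In 2.0+ the message classes live directly on the vision module.
    image = vision.Image(content=content)
    response = client.document_text_detection(image=image)
    return extract_text(response)
```

With this in place, `print(read_local_image("test.png"))` prints only the recognized text instead of the whole response object.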

Using the following code, I was able to get text: "cat\n" as the output:

from pprint import pprint

# Imports the Google Cloud client library
from google.cloud import vision

# Instantiates a client
client = vision.ImageAnnotatorClient()

# Run document text detection on an image stored in Cloud Storage
response = client.annotate_image({
  'image': {'source': {'image_uri': 'gs://<your_bucket>/ORW90.png'}},
  'features': [{'type': vision.enums.Feature.Type.DOCUMENT_TEXT_DETECTION}],
})

pprint(response)

Hope this helps.