Python: Drawing bounding boxes using Pytesseract/OpenCV

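I'm using pytesseract (0.3.2) and OpenCV (4.1.2) to recognize digits in an image. While image_to_string works, image_to_data and image_to_boxes do not. I need to be able to draw bounding boxes on the image, and this has me stumped. I've tried different images, older versions of pytesseract, and so on. I'm on Windows, working in a Jupyter Notebook.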

import cv2
import numpy as np
import pytesseract
from PIL import Image

#erosion
def erode(image):
    kernel = np.ones((5,5),np.uint8)
    return cv2.erode(image, kernel, iterations = 1)

#grayscale
def get_grayscale(image):
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

#thresholding
def thresholding(image):
    #return cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
    return cv2.threshold(image, 200, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

img = cv2.imread('my_image.jpg')
pytesseract.pytesseract.tesseract_cmd = r'C:\mypath\tesseract.exe'

gray = get_grayscale(img)
thresh = thresholding(gray)
eroded = erode(thresh)

custom_config = r'-c tessedit_char_whitelist=0123456789 --psm 6'
print(pytesseract.image_to_string(eroded, config=custom_config))

cv2.imwrite("test.jpg", eroded)

#these return nothing
print(pytesseract.image_to_boxes(Image.open('test.jpg')))
print(pytesseract.image_to_data(Image.open('test.jpg')))

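Instead of using image_to_boxes, an alternative approach is to simply find contours with cv2.findContours, obtain the bounding rectangle coordinates with cv2.boundingRect, and draw the bounding boxes with cv2.rectangle.

Using this sample input image

Drawn boxes

OCR result

1234567890

Code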


Try the following code:

from pytesseract import Output
import pytesseract
import cv2
 
image = cv2.imread("my_image.jpg")

# OpenCV loads images in BGR channel order, while pytesseract assumes RGB,
# so convert from BGR to RGB before running OCR.
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
 
pytesseract.pytesseract.tesseract_cmd = r'C:\mypath\tesseract.exe'
custom_config = r'-c tessedit_char_whitelist=0123456789 --psm 6'
results = pytesseract.image_to_data(rgb, output_type=Output.DICT,lang='eng',config=custom_config)
boxresults = pytesseract.image_to_boxes(rgb,output_type=Output.DICT,lang='eng',config=custom_config)
print(results)
print(boxresults)

for i in range(0, len(results["text"])):
    # extract the bounding box coordinates of the text region from the current result
    tmp_tl_x = results["left"][i]
    tmp_tl_y = results["top"][i]
    tmp_br_x = tmp_tl_x + results["width"][i]
    tmp_br_y = tmp_tl_y + results["height"][i] 
    tmp_level = results["level"][i]
    conf = results["conf"][i]
    text = results["text"][i]
    
    if(tmp_level == 5):
        cv2.putText(image, text, (tmp_tl_x, tmp_tl_y - 10), cv2.FONT_HERSHEY_SIMPLEX,0.5, (0, 0, 255), 1)
        cv2.rectangle(image, (tmp_tl_x, tmp_tl_y), (tmp_br_x, tmp_br_y), (0, 0, 255), 1)
        
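# image_to_boxes reports coordinates with the origin at the bottom-left corner,
# so flip y against the image height before drawing with OpenCV (top-left origin).
img_h = image.shape[0]
for j in range(len(boxresults["left"])):
    left = boxresults["left"][j]
    right = boxresults["right"][j]
    top = img_h - boxresults["top"][j]
    bottom = img_h - boxresults["bottom"][j]
    cv2.rectangle(image, (left, top), (right, bottom), (255, 0, 0), 1)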
       
    
cv2.imshow("image",image)
cv2.waitKey(0)
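If it helps, the word-level filtering done in the first loop above could be pulled into a small helper that also skips low-confidence results. This is only a sketch: `word_boxes`, `min_conf`, and the hand-written `sample` dictionary are hypothetical, with `sample` shaped like what image_to_data returns with output_type=Output.DICT.

```python
def word_boxes(data, min_conf=0.0):
    """Return (text, left, top, right, bottom) for word-level (level 5) entries."""
    found = []
    for i, text in enumerate(data["text"]):
        # Level 5 is the word level; skip empty strings from structural rows
        if data["level"][i] != 5 or not str(text).strip():
            continue
        # conf may arrive as a string or a number depending on the version
        if float(data["conf"][i]) < min_conf:
            continue
        left, top = data["left"][i], data["top"][i]
        found.append((text, left, top,
                      left + data["width"][i], top + data["height"][i]))
    return found

# Hand-written stand-in for an image_to_data result (not real OCR output)
sample = {
    "level": [1, 4, 5, 5],
    "left": [0, 5, 5, 60],
    "top": [0, 8, 8, 9],
    "width": [120, 100, 40, 45],
    "height": [40, 20, 20, 19],
    "conf": ["-1", "-1", "96", "31"],
    "text": ["", "", "12345", "67890"],
}

words = word_boxes(sample, min_conf=50)
print(words)
```

Each returned tuple can be passed straight to cv2.rectangle as (left, top) and (right, bottom), since image_to_data already uses a top-left origin.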

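It's funny that 10 years later the same question still comes up again and again. Anyway, you're doing a great job helping others!
I don't see a way to print the predicted characters and their corresponding boxes with this approach.
I don't understand, aren't bounding boxes what you wanted?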
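On printing each predicted character next to its box: one possible way is to pair the "char" entries from image_to_boxes with their coordinates. The sketch below assumes the dictionary layout returned with output_type=Output.DICT; `char_boxes` and the `sample` values are made up for illustration.

```python
def char_boxes(boxes, image_height):
    """Pair each character with its box as (char, x1, y1, x2, y2), top-left origin."""
    pairs = []
    for ch, left, bottom, right, top in zip(
        boxes["char"], boxes["left"], boxes["bottom"],
        boxes["right"], boxes["top"],
    ):
        # image_to_boxes measures y from the bottom edge, so flip it
        pairs.append((ch, left, image_height - top,
                      right, image_height - bottom))
    return pairs

# Hand-written stand-in for an image_to_boxes result (not real OCR output)
sample = {
    "char": ["1", "2"],
    "left": [10, 32],
    "bottom": [12, 12],
    "right": [28, 50],
    "top": [36, 36],
}

pairs = char_boxes(sample, image_height=48)
for ch, x1, y1, x2, y2 in pairs:
    print(ch, (x1, y1), (x2, y2))
```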