Detecting text in images with Python

python image-processing

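I have more than 100 images, each showing one of two different texts. The images look like the ones below: one says "occupied", the other "unoccupied".

So, is there a way in Python to detect the text in an image with some code and tell these images apart?

If so, I want to identify the occupied images and delete the unoccupied ones. Since I'm new to Python, could someone help me with this?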

Using Tesseract OCR and its Python wrapper pytesseract, this takes only a few lines:

import pytesseract
from PIL import Image

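# Point pytesseract at the Tesseract executable (adjust the path to your installation).
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe"
# Crop to the region that holds the status text: (left, upper, right, lower).
img = Image.open('D:\\tmp2.jpg').crop((0,0,250,35))
# --psm 7 tells Tesseract to treat the image as a single line of text.
print(pytesseract.image_to_string(img, config='--psm 7'))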

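I have tested this on Windows 7. Of course, I'm assuming the text appears at the same position in every image (judging from your example, that seems to be the case). Otherwise you will need a better cropping mechanism.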
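To go from a single image to the whole folder, the OCR call above can be wrapped in a loop. The following is only a sketch under stated assumptions: the helper names (`classify_status`, `read_status_text`, `remove_unoccupied`), the `*.jpg` pattern and the `dry_run` flag are made up for illustration and are not part of pytesseract, and the OCR output is assumed to contain the word "occupied" or "unoccupied". Note that "occupied" is a substring of "unoccupied", so the longer word has to be checked first.

```python
from pathlib import Path


def classify_status(ocr_text):
    """Return 'unoccupied', 'occupied', or None if neither word is found."""
    text = ocr_text.lower()
    # Check 'unoccupied' first: 'occupied' is a substring of it.
    if "unoccupied" in text:
        return "unoccupied"
    if "occupied" in text:
        return "occupied"
    return None


def read_status_text(path):
    """OCR the top-left banner of one image (crop box taken from the answer above)."""
    # Imported here so classify_status can be used without Tesseract installed.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        banner = img.crop((0, 0, 250, 35))
        return pytesseract.image_to_string(banner, config="--psm 7")


def remove_unoccupied(folder, ocr=read_status_text, dry_run=True):
    """Return the images classified as unoccupied; delete them only if dry_run is False."""
    flagged = []
    for path in sorted(Path(folder).glob("*.jpg")):
        if classify_status(ocr(path)) == "unoccupied":
            flagged.append(path)
            if not dry_run:
                path.unlink()
    return flagged
```

Running `remove_unoccupied(folder)` with the default `dry_run=True` only lists the candidates, which is probably worth doing before deleting anything. Images whose text cannot be read are classified as `None` and left alone.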

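This answer is based on the assumption that the images carry only the two different texts you posted in your question. I therefore assume that the number of characters and the colour of the text are always the same ("Room Status: Unoccupied" and "Room Status: Occupied", both in red).

That said, I would try a simpler approach to tell the two types apart. The characters in these images sit very close to each other, so in my opinion separating each character and recognising it with OCR would be very difficult. Instead, I would find the region that contains the text and measure its plain length: "unoccupied" has two more characters than "occupied", so the text is longer.

You can convert the image to the HSV colour space and use the cv2.inRange() function to extract the (red) text. Then you can merge the characters into a single contour with cv2.morphologyEx() and get its length with cv2.minAreaRect(). I hope this helps, or at least gives you a new perspective for finding your solution. Cheers!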

Sample code:

import cv2
import numpy as np

# Read the image and transform to HSV colorspace.
img = cv2.imread('ocupied.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Extract the red text.
lower_red = np.array([0,150,50])
upper_red = np.array([40,255,255])
mask_red = cv2.inRange(hsv, lower_red, upper_red)

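# Search for contours on the mask.
# findContours returns 3 values in OpenCV 3.x and 2 in 4.x; the last two are always (contours, hierarchy).
contours, hierarchy = cv2.findContours(mask_red, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]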

# Create a new mask for further processing.
mask = np.ones(img.shape, np.uint8)*255

# Draw contours on the mask with size and ratio of borders for threshold (to remove other noises from the image).
for cnt in contours:
    size = cv2.contourArea(cnt)
    x,y,w,h = cv2.boundingRect(cnt)
    if 10000 > size > 50 and w*2.5 > h:
        cv2.drawContours(mask, [cnt], -1, (0,0,0), -1)

# Connect neighbour contours and select the biggest one (text).
kernel = np.ones((50,50),np.uint8)
opening = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
gray_op = cv2.cvtColor(opening, cv2.COLOR_BGR2GRAY)
_, threshold_op = cv2.threshold(gray_op, 150, 255, cv2.THRESH_BINARY_INV)
contours_op, hierarchy_op = cv2.findContours(threshold_op, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)[-2:]
cnt = max(contours_op, key=cv2.contourArea)

# Create rotated rectangle to get the 4 points of the rectangle.
rect = cv2.minAreaRect(cnt)

# Take the axis-aligned extent of the rectangle to measure the "length" of the text.
box = cv2.boxPoints(rect).astype(int)  # np.int0 was removed in NumPy 2.0
x1, y1 = (int(v) for v in box.min(axis=0))
x2, y2 = (int(v) for v in box.max(axis=0))

# Draw the rectangle.
cv2.rectangle(img,(x1,y1),(x2,y2),(0,255,0),1)

# Identify the room status.   
if x2 - x1 > 200:
    print('unoccupied')
else:
    print('occupied')

# Display the result and keep the window open until a key is pressed.
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()

Result:

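This is a very complex task, even more so if you are completely new to Python. I suggest you look at OpenCV for image detection.

@Tesseract For someone actually called Tesseract, I'm surprised you didn't recommend Google's Tesseract character recognition. Besides a software system like OpenCV (or Tesseract OCR), you will also need "clean" images: in this case, crop them to the text and make the image as clear as possible.

@Alistair Carscadden Calling it too complicated was perhaps a bit harsh of me. Tesseract seems easy to use even for a beginner. I wrongly assumed that text recognition in images would be beyond a new programmer's abilities. My apologies.

I was only joking about your name and recommending the software to the asker. Here is the link to the project.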

occupied

unoccupied