Python: Is there a way to make a document image upright regardless of its starting orientation?

Tags: python, python-3.x, image, opencv, image-processing

I have documents like this:

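In some cases the image is rotated to the right, or even turned upside down.

Example of a document rotated to the right:

Example of a document turned upside down:

Is there a way to make the image upright regardless of its starting orientation?

Expected result: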


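I think you are being tripped up by EXIF's ability to store an orientation, which some viewers ignore. The simplest approach is to use ImageMagick, which is included in most Linux distros and is available for macOS and Windows. Running this command in Terminal, or in Command Prompt on Windows, first corrects the orientation and then removes the setting so that viewers are not confused:

magick input.jpg -auto-orient -strip result.jpg

If you are using v6 ImageMagick, replace magick with convert.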
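
To make the idea of a stored orientation concrete, here is a minimal pure-Python sketch. It assumes the standard EXIF Orientation values 1, 3, 6 and 8 (the unmirrored cases) and maps each one to the clockwise rotation that makes the image upright. The names EXIF_TO_CW_DEGREES, rotate_cw and auto_orient, and the nested-list "image", are illustrative assumptions; this is not how ImageMagick itself is implemented.

```python
# Clockwise rotation (in degrees) needed to display an image upright,
# keyed by EXIF Orientation value. Mirrored values (2, 4, 5, 7) are
# deliberately left out of this sketch.
EXIF_TO_CW_DEGREES = {1: 0, 3: 180, 6: 90, 8: 270}


def rotate_cw(grid):
    """Rotate a row-major grid (a list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def auto_orient(grid, exif_orientation):
    """Return an upright copy of grid for an unmirrored EXIF orientation."""
    if exif_orientation not in EXIF_TO_CW_DEGREES:
        raise ValueError(f"unsupported orientation: {exif_orientation}")
    for _ in range(EXIF_TO_CW_DEGREES[exif_orientation] // 90):
        grid = rotate_cw(grid)
    return [list(row) for row in grid]


if __name__ == "__main__":
    stored = [[1, 2],
              [3, 4]]
    # Orientation 6 means the stored pixels need one clockwise quarter turn.
    print(auto_orient(stored, 6))
```

Once the pixels have been rotated, the orientation tag should be reset or removed, which is what -strip takes care of in the command above.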


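Failing that, you could iterate through the four possible orientations by rotating the image 90 degrees each time. At each orientation, run the image through pytesseract and choose the orientation that best matches /usr/share/dict/words, or whatever the word list is called on your system. For extra fun and performance, turn the test into a function and call it in parallel on 4 separate threads, one per orientation.

It could look something like this: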

#!/usr/bin/env python3

import cv2
import pytesseract
from textblob import TextBlob

def analyse(im, rotation):
    """OCR one orientation and report how many dictionary words were found."""
    text = pytesseract.image_to_string(im, config="--psm 4")
    # Spell-correct the OCR output before matching it against the dictionary
    correctedText = TextBlob(text).correct()
    legit = []
    for found in correctedText.split():
        if found in words:
            legit.append(found)
    print(f"Rotation: {rotation}, word count: {len(legit)}, words: {legit}")

# Load dictionary of permissible words
words = set()
with open('/usr/share/dict/words') as f:
    for line in f:
        word = line.rstrip()
        # Don't add short words like "at", tesseract often finds small, easily matched strings
        if len(word) >= 5:
            words.add(word)

# Load document
orig = cv2.imread('document.png', cv2.IMREAD_GRAYSCALE)
if orig is None:
    raise SystemExit('Could not read document.png')

# Iterate through the four orientations, saving and analysing each one
rotations = {
    0: None,                                # original, no rotation
    90: cv2.ROTATE_90_CLOCKWISE,
    180: cv2.ROTATE_180,
    270: cv2.ROTATE_90_COUNTERCLOCKWISE,
}
for r, code in rotations.items():
    rotated = orig if code is None else cv2.rotate(orig, code)
    cv2.imwrite(f'rotated-{r}.png', rotated)
    analyse(rotated, r)

Sample Output

Rotation: 0, word count: 43, words: ['between', 'Secession', 'deserted', 'above', 'noted', 'hereby', 'release', 'other', 'money', 'above', 'together', 'action', 'party', 'against', 'other', 'patty', 'holding', 'depart', 'Canada', 'refund', 'cashier', 'cheque', 'shall', 'their', 'irrevocable', 'author', 'hereby', 'commission', 'regeneration', 'above', 'except', 'hereinbefore', 'shall', 'binding', 'whereof', 'hereunto', 'presence', 'whereof', 'hereunto', 'presence', 'whereof', 'hereunto', 'presence']

Rotation: 90, word count: 0, words: []

Rotation: 180, word count: 10, words: ['saliva', 'sense', 'sleeping', 'anode', 'alone', 'sappy', 'sleeping', 'young', 'sawing', 'Utopian']

Rotation: 270, word count: 0, words: []
As you can see, it finds far more words with the first, unrotated image.
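
The threaded variant mentioned earlier, plus the final "pick the winner" step, could be sketched roughly as follows. This is an assumption-laden outline and not part of the original script: count_dictionary_words and best_rotation are hypothetical helper names, and ocr stands for any callable that turns an image into text (for example a small wrapper around pytesseract.image_to_string). The demo uses plain strings as stand-in images so that it runs without Tesseract installed.

```python
from concurrent.futures import ThreadPoolExecutor


def count_dictionary_words(text, dictionary):
    """Count whitespace-separated tokens of text that appear in dictionary."""
    return sum(1 for token in text.split() if token in dictionary)


def best_rotation(candidates, ocr, dictionary):
    """Score every orientation in its own thread and return the best one.

    candidates maps a rotation in degrees to an image; ocr is any callable
    that converts an image to text.
    """
    def score(item):
        rotation, image = item
        return rotation, count_dictionary_words(ocr(image), dictionary)

    with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
        scores = dict(pool.map(score, candidates.items()))
    # Highest word count wins; ties go to the first rotation listed.
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Stand-in "images" and OCR so the sketch runs without Tesseract.
    fake_images = {0: "hereby release money", 90: "",
                   180: "yenom esaeler", 270: ""}
    fake_dictionary = {"hereby", "release", "money"}
    print(best_rotation(fake_images, lambda image: image, fake_dictionary))
```

Threads are a reasonable fit here on the assumption that most of the time is spent waiting for the external tesseract process rather than running Python code.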


Keywords: Python, tesseract, pytesseract, OCR, psm, config, image, image processing, orientation, auto-orient, auto-orientation

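For typical (rectangular), left-to-right text like that in the examples, the following two assumptions can be made:

  • The paper height must always be larger than the paper width. That is easy to check. Rotate by 90 degrees if needed.
  • More text will be found on the left side than on the right side. So sum the pixel values over all rows, which gives one total per column. The total for a region on the left of the document must be larger than the total for a region on the right. Rotate by 180 degrees if needed.

Here is the code I used:

import cv2
import numpy as np
from skimage import io  # Only needed for web grabbing images; for local images, use cv2.imread(...)

def correctOrientation(img):
    print('\nImage:\n------')
    h, w = img.shape
    if (w > h):
        img = cv2.rotate(img, cv2.ROTATE_90_CLOCKWISE)
        h, w = img.shape
        print('\nRotated 90 degrees')
    # Invert the image so that dark text gives large values, then sum each column
    summed = np.sum(255 - img, axis=0)
    if (np.sum(summed[30:130])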
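Output for the given images:

Image:
------
Image:
------
Rotated 90 degrees
Rotated 180 degrees
Image:
------
Rotated 180 degrees
Image:
------

In the images, the documents have additional borders (blue or black). That makes it hard to find the beginnings and ends of the lines. The manually set values for the left and right regions should therefore be adapted in a final solution.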
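
Since the listing above is cut off in its last line, here is a self-contained sketch of the same two-step heuristic with the region bounds exposed as parameters. Several things are assumptions rather than the original code: the function name correct_orientation, the margin and band parameters (chosen so that the left region matches the 30:130 slice visible above), the mirrored right-hand region, and the use of np.rot90 in place of cv2.rotate so that only NumPy is needed.

```python
import numpy as np


def correct_orientation(img, margin=30, band=100):
    """Return (upright_img, applied_rotations) for a grayscale page image.

    margin and band are hypothetical tuning values: skip margin columns at
    each edge, then compare band columns of ink on the left and the right.
    """
    applied = []
    h, w = img.shape
    if w > h:
        # Pages are assumed to be portrait: rotate 90 degrees clockwise.
        img = np.rot90(img, k=-1)
        h, w = img.shape
        applied.append(90)
    # Invert so dark ink contributes large values, then sum each column.
    ink = np.sum(255 - img.astype(np.int64), axis=0)
    left = ink[margin:margin + band].sum()
    right = ink[w - margin - band:w - margin].sum()
    if left < right:
        # More ink on the right than on the left: the page is upside down.
        img = np.rot90(img, k=2)
        applied.append(180)
    return img, applied


if __name__ == "__main__":
    # Synthetic white page with left-aligned, ragged-right dark "text lines".
    page = np.full((400, 300), 255, dtype=np.uint8)
    for i, row in enumerate(range(20, 380, 20)):
        page[row, 40:140 + (i % 5) * 25] = 0
    _, applied = correct_orientation(np.rot90(page, k=2))
    print(applied)
```

For scans with coloured or black borders, margin would need to be large enough to skip the border columns, otherwise the border itself can dominate the column totals.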

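Hope that helps!

EDIT: I forgot the following visualizations. For a correctly oriented document, the values summed over all rows look like this: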

Note the larger values on the left: these are the starts of the lines.

The same for a document rotated by 180 degrees:

Again, note the "artifacts" at the edges, which are caused by the additional image borders.
