Removing lines from an image CAPTCHA with Python


python numpy opencv image-processing


I followed this link and adapted the code it provides to remove the lines from the sample CAPTCHA shown below.

lineremovation.py

from PIL import Image,ImageFilter
from operator import itemgetter
from skimage import measure
import numpy as np
import heapq
import cv2
import matplotlib.pyplot as plt
# scipy.misc.toimage was unused and has been removed from SciPy, so it is no longer imported
from scipy.ndimage import median_filter



#----------------------------------------------------------------

class preprocessing:
    def pre_proc_image(self,img):
        img_removed_noise=self.apply_median_filter(img)
        #img_removed_noise=self.remove_noise(img)
        p1,p2,LL=self.get_line_position(img_removed_noise)
        img=self.remove_line(p1,p2,LL,img_removed_noise)
        img=median_filter(np.asarray(img),1)
        return img

    def remove_noise(self,img):
        img_gray=img.convert('L')
        w,h=img_gray.size
        max_color=np.asarray(img_gray).max()
        pix_access_img=img_gray.load()
        row_img=list(map(lambda x:255 if x in range(max_color-15,max_color+1) else 0,np.asarray(img_gray.getdata())))
        img=np.reshape(row_img,[h,w])
        return img

    def apply_median_filter(self,img):
        img_gray=img.convert('L')
        img_gray=cv2.medianBlur(np.asarray(img_gray),3)
        img_bw=(img_gray>np.mean(img_gray))*255
        return img_bw

    def eliminate_zeros(self,vector):
        return [(dex,v) for (dex,v) in enumerate(vector) if v!=0 ]

    def get_line_position(self,img):
        sumx=img.sum(axis=0)
        list_without_zeros=self.eliminate_zeros(sumx)
        min1,min2=heapq.nsmallest(2,list_without_zeros,key=itemgetter(1))
        l=[dex for [dex,val] in enumerate(sumx) if val==min1[1] or val==min2[1]]
        mindex=[l[0],l[len(l)-1]]
        cols=img[:,mindex[:]]
        col1=cols[:,0]
        col2=cols[:,1]
        col1_without_0=self.eliminate_zeros(col1)
        col2_without_0=self.eliminate_zeros(col2)
        line_length=len(col1_without_0)
        dex1=col1_without_0[round(len(col1_without_0)/2)][0]
        dex2=col2_without_0[round(len(col2_without_0)/2)][0]
        p1=[dex1,mindex[0]]
        p2=[dex2,mindex[1]]
        return p1,p2,line_length

    def remove_line(self,p1,p2,LL,img):
        m=(p2[0]-p1[0])/(p2[1]-p1[1]) if p2[1]!=p1[1] else np.inf
        w,h=len(img),len(img[0])
        x=list(range(h))
        y=list(map(lambda z : int(np.round(p1[0]+m*(z-p1[1]))),x))
        img_removed_line=list(img)
        for dex in range(h):
            i,j=y[dex],x[dex]
            i=int(i)
            j=int(j)
            rlist=[]
            f1=f2=i  # defaults so the length check below is defined even if a scan loop never runs
            while i>=0 and i<len(img_removed_line)-1:
                f1=i
                if img_removed_line[i][j]==0 and img_removed_line[i-1][j]==0:
                    break
                rlist.append(i)
                i=i-1
            i,j=y[dex],x[dex]
            i=int(i)
            j=int(j)
            while i>=0 and i<len(img_removed_line)-1:
                f2=i
                if img_removed_line[i][j]==0 and img_removed_line[i+1][j]==0:
                    break
                rlist.append(i)
                i=i+1
            if np.abs(f2-f1) in [LL+1,LL,LL-1]:
                rlist=list(set(rlist))
                for k in rlist:
                    img_removed_line[k][j]=0

        return img_removed_line

if __name__ == '__main__':
    image = cv2.imread("captcha.png")
    img = Image.fromarray(image)
    p = preprocessing()
    imgNew = p.pre_proc_image(img)
    cv2.imshow("Input", np.array(image))
    cv2.imshow('Output', np.array(imgNew, dtype=np.uint8))
    cv2.waitKey(0)
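
The column-selection step in get_line_position can be tried in isolation. The sketch below uses a small invented binary image (not the real CAPTCHA) to show how the non-zero column sums and heapq.nsmallest pick the two sparsest non-empty columns, which the code appears to treat as the places where only the line is present.

```python
import heapq
from operator import itemgetter

import numpy as np

# Invented binary image (255 = foreground): a horizontal line on row 2
# and a taller "character" in columns 2 and 3.
img = np.zeros((5, 6), dtype=np.int64)
img[2, 1:6] = 255          # line covers columns 1..5
img[0:4, 2:4] = 255        # character covers rows 0..3 of columns 2..3

sumx = img.sum(axis=0)
# Same idea as eliminate_zeros: keep (index, value) pairs with value != 0.
non_empty = [(dex, v) for dex, v in enumerate(sumx) if v != 0]
# Two non-empty columns with the smallest sums.
min1, min2 = heapq.nsmallest(2, non_empty, key=itemgetter(1))
# All columns whose sum matches either minimum; keep the first and last.
cols = [dex for dex, v in enumerate(sumx) if v == min1[1] or v == min2[1]]
mindex = [cols[0], cols[-1]]
print(mindex)  # [1, 5]
```

Here the first and last line-only columns are returned, and the rest of get_line_position takes the middle foreground pixel of each as an end point of the line.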

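In this particular case, the line appears to have a lower intensity than the characters,
so a simple threshold is enough to remove it.
For example, the following call removes the line: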

retval, image = cv2.threshold(image, 125, 255, cv2.THRESH_BINARY)

Afterwards, a noise-removal step such as the median filter from your own code cleans up the result.
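
The two steps can be sketched without an image file. This is only an illustration under stated assumptions: the pixel values are invented, the threshold of 125 is an assumed level that happens to sit between them, and the helpers binary_threshold and median3x3 are hypothetical NumPy stand-ins that follow the same rule as cv2.threshold with cv2.THRESH_BINARY (maxval where the pixel is above the threshold, else 0) and a 3x3 median filter.

```python
import numpy as np


def binary_threshold(gray, thresh=125, maxval=255):
    """Same rule as cv2.THRESH_BINARY: maxval where gray > thresh, else 0."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)


def median3x3(img):
    """3x3 median filter built from nine shifted views (edges replicated)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    windows = [padded[r:r + h, c:c + w] for r in range(3) for c in range(3)]
    return np.median(np.stack(windows), axis=0).astype(img.dtype)


# Synthetic stand-in for the CAPTCHA (hypothetical values): dark background,
# a faint one-pixel line, and a bright 6x6 block playing the "character".
img = np.zeros((20, 40), dtype=np.uint8)
img[10, :] = 90            # faint line, below the threshold
img[5:11, 15:21] = 220     # bright "character", overlapping the line
img[2, 30] = 255           # a single bright speck of noise

binary = binary_threshold(img)
clean = median3x3(binary)

# Pixels checked: on the line, inside the character, on the speck.
print(binary[10, 0], binary[7, 17], binary[2, 30])  # 0 255 255
print(clean[10, 0], clean[7, 17], clean[2, 30])     # 0 255 0
```

The threshold drops the faint line while keeping the character, and the median pass then removes the isolated speck that survived thresholding.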
That one line did everything I wanted, but I don't understand what I was doing wrong before.
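
One thing that may be worth checking (a guess, the thread does not confirm it) is the threshold in apply_median_filter, which compares each pixel with the image mean. On a mostly dark image the mean can fall below the gray level of the line, so the line is binarized to 255 together with the characters. A minimal sketch on invented data, where the values 90 and 220 and the fixed level 125 are assumptions chosen for illustration:

```python
import numpy as np

# Hypothetical grayscale image: dark background (0), faint line (90),
# bright "character" block (220). Values are invented for illustration.
img = np.zeros((20, 40), dtype=np.uint8)
img[10, :] = 90
img[5:11, 15:21] = 220

mean_level = img.mean()

# Mean-based rule, as in apply_median_filter from the question.
by_mean = (img > mean_level) * 255
# Fixed rule with a level between the line (90) and the character (220).
by_fixed = (img > 125) * 255

print(mean_level < 90)                    # True: the mean sits below the line
print(by_mean[10, 0], by_fixed[10, 0])    # 255 0  (line kept vs. removed)
print(by_mean[7, 17], by_fixed[7, 17])    # 255 255 (character kept by both)
```

If the real image behaves like this, the line survives the binarization step and has to be removed geometrically afterwards, which is the harder problem the original code was trying to solve.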