Regex: masking numbers, with exceptions


I want to hide phone numbers in resumes. The resumes also contain dates such as 2001 and 2001-03, and percentages such as
45% 87% 78.45% 56.5%

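I only want to mask the phone numbers, and they do not need to be masked completely. Masking 3 or 4 digits, enough to make the number hard to guess, would be fine. Please help.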

Phone number formats are 
9876543210
98765 43210
98765-43210
9876 543 210
9876-543-210

Just to help you on your way: I would do this in Python, using the re module to search for number-like strings:

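import re

# Candidate runs: at least five characters made of digits, spaces or hyphens
num_re = re.compile(r'[0-9 -]{5,}')
with open('/my/file', 'r') as f:
    for line in f:
        for s in num_re.findall(line):
            # Do some additional testing here, like 'not starting with' or similar
            # str.replace returns a new string, so the result must be assigned back
            line = line.replace(s, '!!!MASKED!!!')
        print(line, end='')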
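I'm not saying this code is complete, but it should help you get started.

By the way, here is why I would use this approach:

  • You can easily add any test you like to weed out false positives

  • It is readable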
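As a rough illustration of the "add any test you like" point, here is a minimal sketch of one possible extra test: a candidate run is masked only if it contains exactly 10 digits. The names `CANDIDATE_RE` and `mask_candidates` are hypothetical, and the candidate pattern is tightened (an assumption, not part of the answer above) so that a run starts and ends on a digit and does not swallow the surrounding spaces.

```python
import re

# Candidate runs: start and end on a digit, with digits, spaces or hyphens in between.
# This is a tightened variant of '[0-9 -]{5,}' (assumption for this sketch).
CANDIDATE_RE = re.compile(r"[0-9][0-9 -]{3,}[0-9]")


def mask_candidates(line, mask="!!!MASKED!!!"):
    """Mask candidate runs that contain exactly 10 digits; leave everything else alone."""

    def replace(match):
        digits = re.sub(r"[^0-9]", "", match.group())
        # Extra test against false positives: dates like 2001-03 have too few digits
        return mask if len(digits) == 10 else match.group()

    return CANDIDATE_RE.sub(replace, line)
```

One known limitation of this sketch: a date immediately followed by a phone number (separated only by a space) forms a single candidate with more than 10 digits, so it is left unmasked.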

    • Here is my answer:

       (([0-9][- ]*){5})(([0-9][- ]*){5})
      
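      It will match exactly 10 digits, with or without - or spaces between them.

      After that, you can replace the first or the third group with *** or anything you like.

      For example: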

      $1*****
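      In Python the same replacement could look roughly like the sketch below, where `\1` plays the role of `$1`. The names `TEN_DIGITS_RE` and `mask_last_five` are made up for illustration.

```python
import re

# Group 1 = first five digits (with any separators), group 3 = last five digits
TEN_DIGITS_RE = re.compile(r"(([0-9][- ]*){5})(([0-9][- ]*){5})")


def mask_last_five(text):
    # Keep group 1 as-is and replace the rest of the match with five asterisks
    return TEN_DIGITS_RE.sub(r"\1*****", text)
```

      Note that `[- ]*` after the last digit also consumes any spaces or hyphens that follow the number, so a space right after the phone number would be replaced along with the digits.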
      

      \d{4,5}[ -]?\d{3}[ -]?\d{2,3}

      Strings matched:

      9876543210, 98765 43210, 98765-43210, 9876 543 210, 9876-543-210

      Strings not matched:

      Dates such as 23-10-2013 and percentages such as 57% or 44%

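      I don't think a more complex regex that rejects invalid phone numbers is needed, since the requirement is only to mask valid phone numbers in the formats above.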
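      The match and non-match behaviour can be checked directly against the sample values from the question. This is only a quick sketch; `PHONE_RE`, `phones` and `others` are illustrative names.

```python
import re

PHONE_RE = re.compile(r"\d{4,5}[ -]?\d{3}[ -]?\d{2,3}")

# Phone formats listed in the question
phones = ["9876543210", "98765 43210", "98765-43210", "9876 543 210", "9876-543-210"]
# Dates and percentages that must be left alone
others = ["2001", "2001-03", "45%", "87%", "78.45%", "56.5%", "23-10-2013"]

# Every phone format should be matched in full
matched = [s for s in phones if PHONE_RE.fullmatch(s)]
# No date or percentage should contain a match anywhere
unmatched = [s for s in others if PHONE_RE.search(s) is None]
```

      With these inputs, `matched` should equal `phones` and `unmatched` should equal `others`. The pattern is deliberately loose, so other long digit runs (for example a 9-digit ID) would also be masked.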

      Python code:

      import re

      def fun(m):
          # Replace the leading 4-5 digits (group 1) with asterisks, keep the rest (group 2)
          return '*' * len(m.group(1)) + m.group(2)

      string = "Resume of candidate abcd. His phone numbers are : 9876543210, 98765 43210, 98765-43210. Date of birth of the candidate is 23-10-2013. His percentage is 57%. One more number 9876 543 213 His percentage in grad school is 44%. Another number  9876-543-210"

      print(re.sub(r'(\d{4,5})([ -]?\d{3}[ -]?\d{2,3})', fun, string))

      Output:

      Resume of candidate abcd. His phone numbers are : *****43210, ***** 43210, *****-43210. Date of birth of the candidate is 23-10-2013. His percentage is 57%. One more number **** 543 213 His percentage in grad school is 44%. Another number  ****-543-210

      More about re.sub:

      re.sub(pattern, repl, string, count=0, flags=0)

      Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl. If the pattern isn't found, string is returned unchanged. repl can be a string or a function.
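      A small sketch of the behaviours described in that excerpt, using a made-up input string: `repl` as a plain string, `repl` as a function, the `count` argument, and the unchanged result when nothing matches.

```python
import re

text = "a1b22c333"

# repl as a string: every run of digits becomes '#'
as_string = re.sub(r"\d+", "#", text)            # 'a#b#c#'

# repl as a function: it receives the match object and returns the replacement
as_function = re.sub(r"\d+", lambda m: "*" * len(m.group()), text)  # 'a*b**c***'

# count=1 replaces only the leftmost occurrence
first_only = re.sub(r"\d+", "#", text, count=1)  # 'a#b22c333'

# No match: the string comes back unchanged
no_match = re.sub(r"x", "#", text)               # 'a1b22c333'
```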


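      How long is the phone number? 10 digits? Any more details?

      There are ten digits, but in different formats, as mentioned: 9876543210, 98765 43210, 98765-43210, 9876 543 210, 9876-543-210

      JavaScript, Notepad++, Java, PCRE? What will you be using the regex with?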