Regex: match two or more characters that are not the same (regex, regex-negation)


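Is it possible to write a regular expression pattern that matches abc, where each letter is not a literal but a placeholder, so that text like xyz (but not xxy) would be matched? I got as far as (.)(?!\1) to match the a in ab, but then I was stumped.

After getting the answer below, I was able to write a routine that generates such a pattern. Using a raw re pattern is much faster than converting both the pattern and the text to a canonical form and then comparing them.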

def pat2re(p, know=None, wild=None):
    r"""Return a compiled re pattern that will find pattern `p`
    in which each different character should find a different
    character in a string. Characters to be taken literally
    or that can represent any character should be given as
    `know` and `wild`, respectively.

    EXAMPLES
    ========

    Characters in the pattern denote different characters to
    be matched; characters that are the same in the pattern
    must be the same in the text:

    >>> pat = pat2re('abba')
    >>> assert pat.search('maccaw')
    >>> assert not pat.search('busses')

    The underlying pattern of the re object can be seen
    with the pattern property:

    >>> pat.pattern
    '(.)(?!\\1)(.)\\2\\1'

    If some characters are to be taken literally, list them
    as known; do the same if some characters can stand for
    any character (i.e. are wildcards):

    >>> a_ = pat2re('ab', know='a')
    >>> assert a_.search('ad') and not a_.search('bc')

    >>> ab_ = pat2re('ab*', know='ab', wild='*')
    >>> assert ab_.search('abc') and ab_.search('abd')
    >>> assert not ab_.search('bad')

    """
    import re
    # make a canonical "hash" of the pattern
    # with ints representing pattern elements that
    # must be unique and strings for wild or known
    # values
    m = {}
    j = 1
    know = know or ''
    wild = wild or ''
    for c in p:
        if c in know:
            m[c] = re.escape(c)  # escape any regex metacharacter taken literally
        elif c in wild:
            m[c] = '.'
        elif c not in m:
            m[c] = j
            j += 1
            assert j < 100
    h = tuple(m[i] for i in p)
    # build pattern
    out = []
    last = 0
    for i in h:
        if isinstance(i, int):
            if i <= last:
                out.append(r'\%s' % i)
            else:
                if last:
                    ors = '|'.join(r'\%s' % i for i in range(1, last + 1))
                    out.append('(?!%s)(.)' % ors)
                else:
                    out.append('(.)')
                last = i
        else:
            out.append(i)
    return re.compile(''.join(out))
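
For comparison, the canonical-form approach mentioned in the question could look roughly like the sketch below. The helper names canon and find_canonical are hypothetical (they are not part of the original post), and the sketch ignores the know and wild options. It only illustrates the idea of mapping both the pattern and each candidate window of the text to the same normalized form before comparing them.

```python
def canon(s):
    """Map each distinct character to the index of its first appearance."""
    seen = {}
    # len(seen) is evaluated before setdefault inserts the new key
    return tuple(seen.setdefault(c, len(seen)) for c in s)


def find_canonical(p, text):
    """Return the first window of text shaped like p, or None."""
    n = len(p)
    target = canon(p)
    # slide a window of len(p) over the text and compare canonical forms
    for i in range(len(text) - n + 1):
        window = text[i:i + n]
        if canon(window) == target:
            return window
    return None


print(find_canonical('abba', 'maccaw'))
print(find_canonical('abba', 'busses'))
```

Every window is re-canonicalized in Python code here, whereas a compiled pattern does the equivalent backreference checks inside the regex engine.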

You can try:
^(.)(?!\1)(.)(?!\1|\2).$


Here is an explanation of the regex pattern:
^          from the start of the string
(.)        match and capture any first character (no restrictions so far)
(?!\1)     then assert that the second character is different from the first
(.)        match and capture any (legitimate) second character
(?!\1|\2)  then assert that the third character does not match first or second
.          match any valid third character
$          end of string
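
As a quick check of the explanation above, the pattern can be tried with Python's re module. This is a minimal sketch; the variable name three_distinct and the sample strings are chosen only for illustration.

```python
import re

# the pattern from the answer: three characters, all different
three_distinct = re.compile(r'^(.)(?!\1)(.)(?!\1|\2).$')

for s in ['xyz', 'xxy', 'xyx', 'xyy', 'abcd']:
    # only 'xyz' has exactly three pairwise different characters
    print(s, bool(three_distinct.match(s)))
```

Each negative lookahead inspects the next character without consuming it, so the following dot still matches that same character once the assertion has passed.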

Thanks: negative lookahead is a new concept for me. So, to find abba but not cccc, we can use (.)(?!\1)(.)\2\1 instead of (.)(.)\2\1.
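
The difference between the two patterns in this comment can be checked directly. The sketch below uses Python's re module, and the variable names are illustrative only.

```python
import re

with_lookahead = re.compile(r'(.)(?!\1)(.)\2\1')
without_lookahead = re.compile(r'(.)(.)\2\1')

for text in ['abba', 'cccc']:
    # both patterns accept 'abba'; only the one without the lookahead accepts 'cccc'
    print(text,
          bool(with_lookahead.search(text)),
          bool(without_lookahead.search(text)))
```

Without the lookahead, nothing prevents the second group from capturing the same character as the first, which is why cccc slips through.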