Python regex: match or potential match

python regex

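How can I use Python's regular expression module (re) to determine whether a match has been made, or whether a match could still be made?

Details: I want a regex pattern that searches for a pattern of words in the correct order, regardless of what is between them. I want a function that returns Yes if the pattern is found, Maybe if a match could still be found, and No if no match can be found. We are looking for the pattern One|...|Two|...|Three. Here are some examples (note that the names, their count, and their order are not important; all I care about is the three words One, Two and Three, and the acceptable words in between are John, Malkovich, Stamos and Travolta).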

Returns Yes:
One|John|Malkovich|Two|John|Stamos|Three|John|Travolta
One|John|Two|John|Three|John
One|Two|Three

Returns Maybe:
One|Two
One

Returns No:
Three|Two|One

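I understand these examples are not airtight, so here is what I have so far for the regex to get a Yes:

import re
if re.match(r'One\|(John\||Malkovich\||Stamos\||Travolta\|)*Two\|(John\||Malkovich\||Stamos\||Travolta\|)*Three\|(John\||Malkovich\||Stamos\||Travolta\|)*', 'One|John|Malkovich|Two|John|Stamos|Three|John|Travolta') is not None:
    return 'Yes'

Obviously, if the input is Three|Two|One the above will fail and we can return No, but how do I check for the Maybe case? I thought about nesting the parentheses (note: untested), but I don't think that would do what I want.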

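More details: I am not actually looking for Travoltas and Malkovichs (shocking, I know). I am matching against inotify event patterns such as IN_MOVE, IN_CREATE, IN_OPEN and IN_MODIFY. I log the events and get hundreds of them, then go in and look for a particular pattern, such as IN_ACCESS ... IN_MODIFY, but in some cases I don't want an IN_DELETE after the IN_OPEN and in others I do. I am essentially pattern matching on inotify events to detect when text editors go wild and try to crush programmers' souls by doing a temporary-file-swap save instead of just modifying the file. I don't want to free these logs immediately, but I only want to hold on to them for as long as necessary. Maybe means don't erase the logs. Yes means do something, then erase the logs. No means do nothing, but still erase the logs. Since I will have multiple rules for each program (i.e. vim vs. gedit vs. emacs), I wanted to use a regular expression, which would be more human-readable and easier to write than creating a massive tree or, as a user suggested, just going over the words with a loop.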
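Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. - Jamie Zawinski
Perhaps an algorithm like this would be more appropriate. Here is some pseudocode: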
matchlist.current = matchlist.first()
for each word in input
    if word = matchlist.current
        matchlist.current = matchlist.next() // assuming next returns null if at end of list
    else if not allowedlist.contains(word)
        return 'No'
if matchlist.current = null // we hit the end of the list
    return 'Yes'
return 'Maybe'
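
The pseudocode above could be translated to Python roughly as follows. This is only a sketch: the function name classify_words, its parameter names, and the "|" separator are assumptions made for illustration, with the word lists taken from the question's examples.

```python
def classify_words(text,
                   required=("One", "Two", "Three"),
                   allowed=("John", "Malkovich", "Stamos", "Travolta"),
                   sep="|"):
    """Return 'Yes', 'Maybe' or 'No' for a separator-delimited string.

    'required' lists the words that must appear in this order;
    'allowed' lists the filler words that may appear anywhere.
    (All names here are hypothetical, chosen for the example.)
    """
    position = 0  # index of the next required word we are waiting for
    for word in text.split(sep):
        if position < len(required) and word == required[position]:
            position += 1          # advance to the next required word
        elif word not in allowed:
            return "No"            # unexpected word: no match is possible
    # All required words seen -> Yes; otherwise a match is still possible.
    return "Yes" if position == len(required) else "Maybe"


if __name__ == "__main__":
    for sample in ("One|John|Two|John|Three|John", "One|Two", "Three|Two|One"):
        print(sample, "->", classify_words(sample))
```

As in the pseudocode, a required word that shows up out of turn is treated like any other unexpected word, so Three|Two|One is rejected on its first word.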

I wouldn't use a regex for this, but it's definitely possible:
regex = re.compile(
    r"""^           # Start of string
    (?:             # Match...
     (?:            # one of the following:
      One()         # One (use empty capturing group to indicate match)
     |              # or
      \1Two()       # Two if One has matched previously
     |              # or
      \1\2Three()   # Three if One and Two have matched previously
     |              # or
      John          # any of the other strings
     |              # etc.
      Malkovich
     |
      Stamos
     |
      Travolta
     )              # End of alternation
     \|?            # followed by optional separator
    )*              # any number of repeats
    $               # until the end of the string.""", 
    re.VERBOSE)

Now you can check for Yes and Maybe by checking whether the regex matches:
>>> yes = regex.match("One|John|Malkovich|Two|John|Stamos|Three|John|Travolta")
>>> yes
<_sre.SRE_Match object at 0x0000000001F90620>
>>> maybe = regex.match("One|John|Malkovich|Two|John|Stamos")
>>> maybe
<_sre.SRE_Match object at 0x0000000001F904F0>

And if the regex doesn't match at all, the answer is No:
>>> no = regex.match("Three|Two|One")
>>> no is None
True
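
Both the Yes and the Maybe input produce a match object, so one more step is needed to tell them apart. One possible way, sketched below, is to test whether the empty capturing group after Three (group 3) took part in the match. The wrapper function classify_regex and the constant name ORDERED_WORDS are hypothetical names for this example; the pattern is the one above, condensed.

```python
import re

# Same pattern as above, condensed. Each empty group () records that its
# keyword has been seen; a backreference to a group that has not matched
# yet fails in Python's re module, which enforces the required order.
ORDERED_WORDS = re.compile(
    r"""^
    (?:
      (?: One() | \1Two() | \1\2Three()
        | John | Malkovich | Stamos | Travolta )
      \|?          # optional separator
    )*
    $""",
    re.VERBOSE)


def classify_regex(text):
    """Return 'Yes', 'Maybe' or 'No' (hypothetical helper for illustration)."""
    match = ORDERED_WORDS.match(text)
    if match is None:
        return "No"
    # Group 3 is the empty group after Three: it is None unless Three matched.
    return "Yes" if match.group(3) is not None else "Maybe"


if __name__ == "__main__":
    for sample in ("One|Two|Three", "One|John|Malkovich|Two|John|Stamos",
                   "Three|Two|One"):
        print(sample, "->", classify_regex(sample))
```

This relies on Python's re keeping a group's last captured value across repetitions of the enclosing group; other regex engines may behave differently.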

Are you sure that using a regex is the right approach? You could simply do [word for word in text.split() if word in word_list] and then check the order of that list.
@JoelCornett hmmmm. See the edit plz.
What should be returned for input with some amount of acceptable noise (John, Stamos)? Or for One|Three?
@TimPietzcker I added more info, but what I'm looking for is exactly the pattern (One|Two|Three).
Are your "maybe" matches always at the end of the string?
I think there's an error in your pseudocode: = on line 3 means equality, while on line 4 it means assignment.
@Joel: Did you upgrade to version 11.5? It has a new feature that lets it understand the intended meaning and compile transparently. ;)
This would definitely make things harder than lists and trees.
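
The list-comprehension idea from the comments could be sketched as follows. Note the assumptions: the input is split on "|" (rather than on whitespace, as in the comment), the function name classify_by_filter is made up for this example, and, unlike the two answers, this variant silently ignores every word that is not a keyword instead of checking it against a list of allowed fillers.

```python
def classify_by_filter(text, required=("One", "Two", "Three"), sep="|"):
    """Keep only the keywords, then compare their order with 'required'.

    Hypothetical helper: any non-keyword is treated as ignorable noise.
    """
    found = [word for word in text.split(sep) if word in required]
    if found == list(required):
        return "Yes"    # every keyword present, in the right order
    if found == list(required[:len(found)]):
        return "Maybe"  # a correct prefix so far; the rest may still arrive
    return "No"         # wrong order, a skipped keyword, or a repeat


if __name__ == "__main__":
    for sample in ("One|John|Two|John|Three|John", "One|John", "One|Three"):
        print(sample, "->", classify_by_filter(sample))
```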