Generating all possible matches for a regex pattern in PHP

php regex parsing


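There are quite a few questions about how to parse a regex pattern and output all possible matches for that pattern. For some reason, though, every one I can find (there are several, and possibly more) deals with Java or some flavour of C (and just one with JavaScript), and I currently need to do this in PHP.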

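I have googled to my heart's content, but whatever I do, almost the only things Google gives me are links to the documentation for `preg_match()` and pages about how to use regex, which is the opposite of what I want here.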

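My regex patterns are all very simple and guaranteed to be finite; the only syntax used is:

- `[]` for character classes
- `()` for subgroups (capturing not required)
- `|` (pipe) for alternative matches within subgroups
- `?` for zero-or-one matches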

So an example might be `[ct]hun(k|der)(s|ed|ing)?`, to match all possible forms of the verbs chunk, thunk, chunder and thunder, for a total of 16 permutations.

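Ideally, there would be a PHP library or tool that iterates through (finite) regex patterns and outputs all possible matches, ready to go. Does anyone know whether such a library or tool already exists?

If not, what would be an optimised approach to making one? The JavaScript answer is the closest thing I could find that I should be able to adapt, but unfortunately I cannot wrap my head around how it actually works, which makes adapting it trickier. Besides, there may well be better ways of doing it in PHP anyway. Some logical pointers on how best to break the task down would be greatly appreciated.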

Edit: Since it is apparently unclear what this would look like in practice, I am looking for something that allows this kind of input:

$possibleMatches = parseRegexPattern('[ct]hun(k|der)(s|ed|ing)?');

Printing `$possibleMatches` should then give a list of all 16 matching strings (the order of the elements does not matter in my case).

Method

You need to strip out the variable parts of the pattern; you can use `preg_match_all` to do this:

preg_match_all("/(\[\w+\]|\([\w|]+\))/", '[ct]hun(k|der)(s|ed|ing)?', $matches);

/* Regex:

/(\[\w+\]|\([\w|]+\))/
/                       : Pattern delimiter
 (                      : Start of capture group
  \[\w+\]               : Character class pattern
         |              : OR operator
          \([\w|]+\)    : Capture group pattern
                    )   : End of capture group
                     /  : Pattern delimiter

*/
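To see what this pattern actually captures, a short check along the following lines can help. It is only an illustrative sketch; the variable name `$tokens` is an assumption and not part of the answer's code.

```php
<?php
// Illustrative only: list the variable parts found in the question's pattern.
$pattern = '[ct]hun(k|der)(s|ed|ing)?';

preg_match_all("/(\[\w+\]|\([\w|]+\))/", $pattern, $matches);

// $matches[0] holds the full text of every match, in order of appearance.
$tokens = $matches[0];
print_r($tokens);
```

With this pattern the three tokens found are `[ct]`, `(k|der)` and `(s|ed|ing)`. The trailing `?` is not part of any token at this stage.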

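You can then expand the captured tokens into letters or words (depending on their type).

Then recursively iterate through each resulting `$array` of options.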
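The two steps above can be sketched roughly as follows. This is a minimal sketch rather than the answer's own code: the function names `expandToken` and `collectMatches` are hypothetical, the results are collected in an array instead of being printed, and it assumes only character classes and non-nested groups with no `?`.

```php
<?php
// Hypothetical helper: turn one token into its list of alternatives.
function expandToken(string $token): array
{
    $inner = substr($token, 1, -1);      // strip the surrounding brackets
    return $token[0] === '['
        ? str_split($inner)              // "[ct]"    -> ["c", "t"]
        : explode('|', $inner);          // "(k|der)" -> ["k", "der"]
}

// Hypothetical helper: substitute each option for the first remaining token,
// then recurse on the rest until no option lists are left.
function collectMatches(string $pattern, array $optionLists, string $tokenRegex): array
{
    if ($optionLists === []) {
        return [$pattern];
    }
    $options = array_shift($optionLists);
    $results = [];
    foreach ($options as $option) {
        // Limit 1: only the first remaining token is replaced.
        $next    = preg_replace($tokenRegex, $option, $pattern, 1);
        $results = array_merge($results, collectMatches($next, $optionLists, $tokenRegex));
    }
    return $results;
}

$tokenRegex = '/(\[\w+\]|\([\w|]+\))/';
$pattern    = '[ct]hun(k|der)';

preg_match_all($tokenRegex, $pattern, $matches);
$words = collectMatches($pattern, array_map('expandToken', $matches[0]), $tokenRegex);
print_r($words);
```

For `[ct]hun(k|der)` this yields chunk, chunder, thunk and thunder. Returning an array rather than echoing is closer to the `parseRegexPattern()` call shape asked for in the question.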

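Additional functionality

Expanding nested groups

In use, you would run the nested-group expansion (the `preg_replace_callback` call in the complete example) before the `preg_match_all`.

Output: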

This happen(s|ed) to (become|be|have|having) test case 1?

This happens to become test case 1
This happens to become test case 
This happens to be test case 1
This happens to be test case 
This happens to have test case 1
This happens to have test case 
This happens to having test case 1
This happens to having test case 
This happened to become test case 1
This happened to become test case 
This happened to be test case 1
This happened to be test case 
This happened to have test case 1
This happened to have test case 
This happened to having test case 1
This happened to having test case

Matching single letters

The gist of it is updating the regex:

$matchPattern = "/(?:(\[\w+\]|\([\w|]+\))\??|(\w\?))/";

and adding an `else` branch to the `prepOptions` function:

} else {
    $array = [$cleanString];
}
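As a quick illustration of what the updated pattern picks up, the sketch below runs it against the example sentence used in the output above. It only checks the pattern and is not part of the original answer; `$tokens` is an illustrative name.

```php
<?php
$matchPattern = "/(?:(\[\w+\]|\([\w|]+\))\??|(\w\?))/";
$regex        = 'This happen(s|ed) to (become|be|have|having) test case 1?';

preg_match_all($matchPattern, $regex, $matches);

// Full matches: bracketed tokens (with an optional trailing "?")
// and single word characters followed by "?".
$tokens = $matches[0];
print_r($tokens);
```

The tokens found are `(s|ed)`, `(become|be|have|having)` and `1?`. The new `else` branch is what lets `prepOptions` handle a bare token such as `1?`, which is neither a character class nor a group.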


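So you don't want to match a pattern, you want to see which "words" the pattern would match?

@Steven Yes, exactly. I basically want to turn a pattern into a non-regex list containing all the strings the pattern could match.

Obviously, patterns that use quantifiers could mean a rather long list?

@Steven: Yes, which is why I specified that my patterns are all finite (that is, they contain nothing that could make the list of matching strings infinite). No regex syntax is used at all apart from the four items listed in the question.

OK, if that is the case, you could simply strip out the variable groups, split them as needed (by letter for character classes, by word for capture groups), and then use recursion to process each level?

Updated to include the `?` functionality.

That is very neat… I need to read through it more carefully to map out the inner workings properly in my head, but I get the gist of the flow. Unfortunately, it does not work with nested subgroups, which are also the part I have the hardest time wrapping my head around. I tried a few permutations of the match pattern (like `/(\[\w+\]\\\\)((((?>[^()]+)|(?R))\)\??/`), but none of them worked. I wonder whether it is even possible in a single `preg_match_all()` call. To give an actual example of a pattern with nested submatches, I have `andagts(bog(en)|bøger(ne)?)s?`.

That is true, at least for infinite recursion. In practice the recursion never goes more than three levels deep, so it is still finite, although more complex than without recursion. I do have control over the source, so maybe I should go through it and avoid nested subgroups. There are not that many subgroups anyway, and they can all easily be converted to non-nested alternatives…

You could always expand the nested groups? Something like: `echo preg_replace("/(\(|\|)(\w+)\((\w+)\)\?/", "$1$2|$2$3", $pattern);` (this snippet also assumes that nested groups always take the form `(…)?`)

I ended up just removing the nested groups; there were only about 12 in total (out of around 2000 patterns). Although this method does not expand `?` into variants with and without the preceding entity, and in some cases does not expand as expected, it works well enough for my purposes. The goal is to create a query list file for an index in an InDesign document generated with the IndexMatic plugin. IndexMatic supports regular expressions, but it choked on my query list file, probably because there were too many regexes that were too complex.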

Complete working example

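// Substitute each option for the first remaining variable part of $pattern,
// recursing until every option list has been consumed, then print the result.
function printMatches($pattern, $array, $matchPattern)
{
    $currentArray = array_shift($array);

    foreach ($currentArray as $option) {
        // Limit of 1: only the first match is replaced, later parts stay in place.
        $patternModified = preg_replace($matchPattern, $option, $pattern, 1);
        if (!count($array)) {
            echo $patternModified, PHP_EOL;
        } else {
            printMatches($patternModified, $array, $matchPattern);
        }
    }
}

// Convert each matched token into the list of strings it can stand for.
function prepOptions($matches)
{
    $possibilities = [];

    foreach ($matches as $match) {
        // Strip the regex syntax, leaving only the letters and pipes.
        $cleanString = preg_replace("/[\[\]\(\)\?]/", "", $match);

        if ($match[0] === "[") {
            // Character class: one option per letter.
            $array = str_split($cleanString, 1);
        } elseif ($match[0] === "(") {
            // Group: one option per alternative.
            $array = explode("|", $cleanString);
        } else {
            // Single optional character, e.g. "1?".
            $array = [$cleanString];
        }
        // A trailing "?" adds the empty string as an option
        // (negative string offsets need PHP 7.1 or later).
        if ($match[-1] === "?") {
            $array[] = "";
        }
        $possibilities[] = $array;
    }
    return $possibilities;
}

$regex        = 'This happen(s|ed) to (be(come)?|hav(e|ing)) test case 1?';
$matchPattern = "/(?:(\[\w+\]|\([\w|]+\))\??|(\w\?))/";

// Flatten one level of nesting: "(be(come)?" becomes "(become|be"
// and "|hav(e|ing)" becomes "|have|having".
$regex = preg_replace_callback("/(\(|\|)(\w+)(?:\(([\w\|]+)\)\??)/", function($array){
    $output = explode("|", $array[3]);
    if ($array[0][-1] === "?") {
        $output[] = "";
    }
    // Prefix every nested alternative with the word in front of the nested group.
    foreach ($output as &$option) {
        $option = $array[2] . $option;
    }
    return $array[1] . implode("|", $output);
}, $regex);

// Collect the variable parts of the flattened pattern.
preg_match_all($matchPattern, $regex, $matches);

printMatches(
    $regex,
    prepOptions($matches[0]),
    $matchPattern
);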

Output:

This happens to become test case 1
This happens to become test case 
This happens to be test case 1
This happens to be test case 
This happens to have test case 1
This happens to have test case 
This happens to having test case 1
This happens to having test case 
This happened to become test case 1
This happened to become test case 
This happened to be test case 1
This happened to be test case 
This happened to have test case 1
This happened to have test case 
This happened to having test case 1
This happened to having test case
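
The comments note that the regex-based approach above does not cope with arbitrarily nested subgroups. One alternative, shown here only as a hedged sketch and not as the answer's method, is a small recursive-descent parser that expands the pattern while reading it. All function names below are hypothetical, the sketch supports only `[]`, `()`, `|` and `?`, and it assumes well-formed, valid UTF-8 input (there is no error handling for unbalanced brackets).

```php
<?php
// Hypothetical recursive-descent expander; supports [], (), | and ? only.
function expandPattern(string $pattern): array
{
    // Split into characters in a multibyte-safe way (e.g. for "ø").
    $chars = preg_split('//u', $pattern, -1, PREG_SPLIT_NO_EMPTY);
    $pos   = 0;
    return parseAlternation($chars, $pos);
}

// alternation := sequence ('|' sequence)*
function parseAlternation(array $chars, int &$pos): array
{
    $results = parseSequence($chars, $pos);
    while ($pos < count($chars) && $chars[$pos] === '|') {
        $pos++;                                   // skip '|'
        $results = array_merge($results, parseSequence($chars, $pos));
    }
    return $results;
}

// sequence := (atom '?'?)*, stopping at '|', ')' or the end of the input
function parseSequence(array $chars, int &$pos): array
{
    $results = [''];
    while ($pos < count($chars) && $chars[$pos] !== '|' && $chars[$pos] !== ')') {
        $options = parseAtom($chars, $pos);
        if ($pos < count($chars) && $chars[$pos] === '?') {
            $pos++;
            $options[] = '';                      // the atom may be absent
        }
        // Cartesian product of everything so far with the new options.
        $combined = [];
        foreach ($results as $prefix) {
            foreach ($options as $option) {
                $combined[] = $prefix . $option;
            }
        }
        $results = $combined;
    }
    return $results;
}

// atom := '[' chars ']' | '(' alternation ')' | a single literal character
function parseAtom(array $chars, int &$pos): array
{
    $char = $chars[$pos++];
    if ($char === '[') {
        $options = [];
        while ($pos < count($chars) && $chars[$pos] !== ']') {
            $options[] = $chars[$pos++];
        }
        $pos++;                                   // skip ']'
        return $options;
    }
    if ($char === '(') {
        $options = parseAlternation($chars, $pos);
        $pos++;                                   // skip ')'
        return $options;
    }
    return [$char];
}

$forms = expandPattern('This happen(s|ed) to (be(come)?|hav(e|ing)) test case 1?');
echo implode(PHP_EOL, $forms), PHP_EOL;
```

For the test pattern used above, this produces the same 16 strings without the separate nested-group flattening step, because nesting is handled by the mutual recursion between `parseAtom` and `parseAlternation`.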