Regex: generating all string combinations from a template


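How can I generate all string combinations from a template?

For example, given the template string

"{I|We} want {2|3|4} {apples|pears}"

The braces "{…}" denote a group of one or more words, with the words separated by "|".

The class should generate one string for every combination that takes one word from each group.

I know this relates to finite automata and regular expressions. How can the combinations be generated efficiently?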

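For example, write the template as

G[0][j] want G[1][j] G[2][j]

First, generate all possible index combinations c = [0..1][0..2][0..1]:

000 001 010 011 020 021 100 101 110 111 120 121

Then, for each c, replace G[i][j] with G[i][c[i]].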
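The index enumeration described above behaves like an odometer whose digits each have their own range. A minimal Python sketch of that idea follows; the helper name `index_combinations` and the list `G` are illustrative and not part of the original post.

```python
def index_combinations(sizes):
    """Yield every index tuple c with 0 <= c[i] < sizes[i], in odometer order."""
    if any(n == 0 for n in sizes):
        return  # an empty group means there are no combinations at all
    c = [0] * len(sizes)
    while True:
        yield tuple(c)
        # Find the rightmost position that can still be incremented,
        # resetting every exhausted position to 0 on the way.
        i = len(sizes) - 1
        while i >= 0 and c[i] == sizes[i] - 1:
            c[i] = 0
            i -= 1
        if i < 0:
            return  # every position wrapped around: enumeration is finished
        c[i] += 1


# Hypothetical word groups for the template in the question.
G = [["I", "We"], ["2", "3", "4"], ["apples", "pears"]]
sizes = [len(g) for g in G]

codes = ["".join(map(str, c)) for c in index_combinations(sizes)]
sentences = [f"{G[0][c[0]]} want {G[1][c[1]]} {G[2][c[2]]}"
             for c in index_combinations(sizes)]
```

Because the generator yields one tuple at a time, the combinations never have to be held in memory all at once unless the caller collects them, as the two lists here do.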

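Convert each group of strings {…} into an array of strings, so that you have n arrays. So for

"{I|We} want {2|3|4} {apples|pears}"

we will have 4 arrays (counting the literal "want" as a one-element array).

Put each of these arrays into another array. In my example I call this outer array wordSet; the generated strings are gathered in collection.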
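The conversion step is not shown in the answer, so here is one possible sketch of it in Python. It assumes groups are never nested and that whitespace around words is insignificant; the function name `parse_template` is made up for illustration.

```python
import re


def parse_template(template):
    """Split a template into a list of word lists, one per segment."""
    word_sets = []
    # The capturing group makes re.split keep the {...} groups in its result.
    for part in re.split(r"(\{[^{}]*\})", template):
        if part.startswith("{") and part.endswith("}"):
            # A group: drop the braces, split on "|", trim surrounding spaces.
            word_sets.append([w.strip() for w in part[1:-1].split("|")])
        elif part.strip():
            # Literal text between groups becomes a one-element array.
            word_sets.append([part.strip()])
    return word_sets


word_sets = parse_template("{I|We} want {2 | 3 | 4} {apples | pears}")
```

For the example template this produces four arrays, with the literal "want" as the second one.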

The following is Java code, but it is simple enough that you should be able to convert it to any language.

import java.util.ArrayList;

// Entry point: wordSet holds one String[] per template segment; results are added to collection.
void makeStrings(String[][] wordSet, ArrayList<String> collection) {
    makeStrings(wordSet, collection, "", 0, 0);
}

void makeStrings(String[][] wordSet, ArrayList<String> collection, String currString, int x_pos, int y_pos) {
    // No more word sets left: one combination is complete, so record it.
    if (x_pos >= wordSet.length) {
        collection.add(currString);
        return;
    }
    // y_pos is out of bounds: no more words within the current set {...}.
    if (y_pos >= wordSet[x_pos].length) {
        return;
    }

    // "Vertical" branch: accept the word at [x_pos][y_pos], then move to the next word set.
    // The separator is skipped for the first word so results have no leading space.
    String word = wordSet[x_pos][y_pos];
    String combo_x = (x_pos == 0) ? word : currString + " " + word;
    makeStrings(wordSet, collection, combo_x, x_pos + 1, 0);

    // "Horizontal" branch: keep the string unchanged and try the next word in the same set.
    makeStrings(wordSet, collection, currString, x_pos, y_pos + 1);
}


*Edited for corrections.

$ for q in {I,We}\ want\ {2,3,4}\ {apples,pears}; do echo "$q" ; done
I want 2 apples
I want 2 pears
I want 3 apples
I want 3 pears
I want 4 apples
I want 4 pears
We want 2 apples
We want 2 pears
We want 3 apples
We want 3 pears
We want 4 apples
We want 4 pears
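Brace expansion is performed by the shell on the literal command text, so it does not help directly when the template arrives as input at run time. The same Cartesian product can be sketched in Python with `itertools.product`; the `groups` list here is assumed to have been parsed from the template already.

```python
from itertools import product

# One list per template segment; the literal "want" is a one-element list.
groups = [["I", "We"], ["want"], ["2", "3", "4"], ["apples", "pears"]]

# product(*groups) varies the rightmost group fastest, like brace expansion.
lines = [" ".join(words) for words in product(*groups)]
```

`product` is lazy, so iterating over it directly instead of building `lines` avoids materialising every combination.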

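The most efficient solution to this problem that I have found so far is the Python module sre_yield.

"The goal of sre_yield is to efficiently generate all values that can match a given regular expression, or count possible matches efficiently."

Emphasis added by me.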

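To apply it to your problem as stated: express the template as a regex pattern and pass it to sre_yield to get all possible combinations, or to count the possible matches, like this: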

import sre_yield
result = set(sre_yield.AllStrings("(I|We) want (|2|3|4) (apples|pears)"))
len(result)
result

Output:

16
{'I want  apples',
 'I want  pears',
 'I want 2 apples',
 'I want 2 pears',
 'I want 3 apples',
 'I want 3 pears',
 'I want 4 apples',
 'I want 4 pears',
 'We want  apples',
 'We want  pears',
 'We want 2 apples',
 'We want 2 pears',
 'We want 3 apples',
 'We want 3 pears',
 'We want 4 apples',
 'We want 4 pears'}
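The count of 16 can be checked by hand, since each alternation contributes an independent choice. Note that the pattern above uses `(|2|3|4)`, which adds an empty alternative to the number group; that is why strings such as 'I want  apples' (with two spaces) appear, and why the count is 16 rather than the 12 produced by the template in the question. A small sketch of the arithmetic, with `groups` written out by hand as an assumption:

```python
from math import prod

# Alternatives of each group in "(I|We) want (|2|3|4) (apples|pears)".
groups = [["I", "We"], ["", "2", "3", "4"], ["apples", "pears"]]

# Independent choices multiply: 2 * 4 * 2.
count = prod(len(g) for g in groups)
```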

PS: I use a set to avoid duplicates, rather than the list shown on the project page.
The principle is:

regex -> NFA
NFA -> minimal DFA
DFS traversal of the DFA (collecting all characters)
This principle is implemented in the following:

DeterministicAutomaton dfa = Pattern.compileGenericAutomaton("(I|We) want (2|3|4)? (apples|pears)")
  .toAutomaton(new FromGenericAutomaton.ToMinimalDeterministicAutomaton());
if (dfa.getProperty().isAcyclic()) {
  for (String s : dfa.getSamples(1000)) {
    System.out.println(s);
  }
}

This is a job for a programming language, not just a regex.

Google it. N/A for regex.

Sounds like nested loops.

The regex above can be stored as a finite tree, and a depth-first search gives all possible strings. As loops, C pseudocode would look like: pI[] = {'I', 'We'}; pJ[] = {'', '2', '3', '4'}; pK[] = {'apples', 'pears'}; for (i = 0; i < 2; i++) { string strI = pI[i]; strI += 'want'; for (j = 0; j < 4; j++) { strJ = strI + pJ[j]; for (k = 0; k < 2; k++) { strK = strJ + pK[k]; print(strK); } } }

@sln The template is input (it is not a constant, just an example).

Sorry, I had trouble with the formatting. Also, if you need the indices, you can build another String[] that records the current index [y] each time a particular word is added.