Single-character bracket matching

Tags: algorithm, parsing, grammar, ambiguous-grammar

Given the grammar rules (BNF, `|` means "or"), with `+` left-associative (`a+a+a` means `(a+a)+a`), concatenation left-associative (`aaa` means `(aa)a`, not `a(aa)`), and `+` eating its operands lazily (`aa+aa` means `a(a+a)a`).

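Question: is this grammar ambiguous? That is, can some string be parsed in two different ways?

Examples:

Allowed: `a`, `a+a`, `a+"a"`, `"a+a"+"a+a"` (read as `(a+a)+(a+a)`), `"a"+"a"+"a"` (read as `((a)+(a))+(a)`), `a+a`, `a+a`

Forbidden: `"a+a"`, `+"a"`, `a++a`, `"a"`, `a+"a"`, `"a+a"+a"`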

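Application: I don't like escaping `{` and `}` in LaTeX, so I would like to make a LaTeX dialect in which only one character needs escaping, so that a single character `"` can replace both `{` and `}`; for example, writing something like `"1+2"/3"^"a+b"` instead of `{\frac{1+2}{3}}{a+b}`.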
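Here is a quick and dirty script that uses Marpa::R2, a Perl interface to Marpa, to parse the inputs with the grammar you provided and with a modified version of it, which supports lazy eating and left associativity but does not forbid `"a"`.

For the inputs you provided, the grammars are not ambiguous; otherwise an exception would be thrown.

Hope this helps.

P.S. Marpa's general BNF parsing capability can be used to provide a front end with better syntax for TeX (among others).

Update: in reply to the asker's comment.

This grammar (in which `||` means lower precedence) parses the inputs in the question unambiguously, except for `"a+a"+"a+a"`, for which an `"x"` alternative may be needed (that would make the grammar ambiguous, as rici helpfully suggested in a comment below; more on this in the next paragraph).

Overall, with the double quote `"` serving as parens and `+` as, well, plus, it is easy to add a symbol for an operator with lower precedence than `+`, e.g. `.` for concatenation, making this a classic term/factor grammar, which can be expressed in the Marpa SLIF DSL as follows: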
x ::= a
  || '"' x '"' assoc => group
  || x '+' x
  || x '.' x
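
As a rough illustration of why a term/factor grammar like this one can get away with a single bracket character, here is a minimal Python sketch. It is not the Marpa script from this answer, and the function name `classify_quotes` is made up for the example. It assumes that the only operand is `a`, that the operators are `+` and `.`, and that two operands are never written side by side. Under those assumptions a `"` seen where an operand is expected can only open a group, and a `"` seen right after an operand can only close one.

```python
def classify_quotes(text, operators="+."):
    """Rewrite each '"' as '(' or ')' (hypothetical helper, illustration only).

    Assumes an operator grammar without juxtaposition: operands are the
    single letter 'a' or a quoted group, separated by explicit operators.
    """
    out = []
    depth = 0
    expect_operand = True  # True at the start, after an operator, after an opening quote
    for pos, ch in enumerate(text):
        if ch == '"':
            if expect_operand:
                # Operand position: the quote must open a group.
                out.append("(")
                depth += 1
            else:
                # Right after an operand: the quote must close a group.
                if depth == 0:
                    raise ValueError(f"unmatched closing quote at {pos}")
                out.append(")")
                depth -= 1
        elif ch in operators:
            if expect_operand:
                raise ValueError(f"operator {ch!r} has no left operand at {pos}")
            out.append(ch)
            expect_operand = True
        elif ch == "a":
            if not expect_operand:
                raise ValueError(f"juxtaposed operand at {pos}")
            out.append(ch)
            expect_operand = False
        else:
            raise ValueError(f"unexpected character {ch!r} at {pos}")
    if expect_operand or depth != 0:
        raise ValueError("incomplete expression")
    return "".join(out)


if __name__ == "__main__":
    print(classify_quotes('"a+a"+"a+a"'))
```

The sketch stops working as soon as juxtaposition (concatenation without an operator) is allowed, because a quote following an operand could then either close the current group or open a new one.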

Update 1:
# input: "a+a"+"a+a"
Setting trace_terminals option
Lexer "L0" accepted lexeme L1c1 e1: '"'; value="""
Lexer "L0" accepted lexeme L1c1 e1: '"'; value="""
Lexer "L0" accepted lexeme L1c2 e2: a; value="a"
Lexer "L0" accepted lexeme L1c3 e3: '+'; value="+"
Lexer "L0" accepted lexeme L1c3 e3: '+'; value="+"
Lexer "L0" accepted lexeme L1c4 e4: a; value="a"
Lexer "L0" accepted lexeme L1c5 e5: '"'; value="""
Lexer "L0" accepted lexeme L1c5 e5: '"'; value="""
Lexer "L0" accepted lexeme L1c6 e6: '+'; value="+"
Lexer "L0" accepted lexeme L1c6 e6: '+'; value="+"
Lexer "L0" accepted lexeme L1c7 e7: '"'; value="""
Lexer "L0" accepted lexeme L1c8 e8: a; value="a"
Error in SLIF parse: No lexeme found at line 1, column 9
* String before error: "a+a"+"a
* The error was at line 1, column 9, and at character 0x002b '+', ...
* here: +a"
Marpa::R2 exception at C:\cygwin\home\Ruslan\Marpa-R2-work\q27655176.t line 63.

Progress report is:
F3 @7-8 L1c7-8 x -> a .
R7:6 @0-8 L1c1-8 x -> '"' x '"' '+' '"' x . '"'
# ast dump:
undef

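`"frac"1+2"3"^"a+b"` - ouch! Questions of ambiguity aside, how do you expect to be able to parse that visually?

Yes, that is my other question :) how to design an algorithm that determines which `"` is an opening bracket and which is a closing one. But I thought there were compilers that could do this.

I wasn't thinking about computers; I was thinking about human readers.

Good point! But perhaps if the editor automatically used a darker grey background for more deeply nested `"…"` groups, a human could interpret this code on the fly.

Even `x := x x | a` is an ambiguous CFG. If you allow `x -> "x"` then you have ambiguity, because `"a"a"a"` can be parsed as `{a{a}a}` or as `{a}a{a}`. (Replace the `a`s with more complicated expressions involving `+` to see more interesting ambiguities.)

I don't have Python to test the code, but if the grammar is modified the way I did in the question, does it work, and is it ambiguous? ANTLR complains that the grammar x:'a'| x | x'+'x | x'+'x | x'+''''''x''x''“|”“‘x’”+‘x |”“‘x’”+“‘x’””” is left-recursive, and the GOLD parser simply rejects `"a++"a"+a`.

Update posted; I hope it answers your question. Unlike others, Marpa parses anything you can express in BNF, including left, right, and middle recursion. It is worth a try.

Sorry for my stupidity, but how is `"a+a"+"a+a"` ambiguous? Although your code gives an error, I can think of only one way to parse it: （['x'，“'”，['x'，['a'，'a']]，“+'，['x'，['a'，'a']]，“'”，“'”，['x'，['a'，['a']，['x'，['a']，['a'，'a']]，”）, or in the more natural notation `(a+a+a)`.

It can be parsed if the `x ::= a` and `'+'` rules are allowed to have equal precedence.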
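
The ambiguity raised in the comments (allowing `x -> "x"` together with concatenation) can be checked by counting parse trees. The sketch below is only an illustration in Python, not code from the thread. It assumes a simplified grammar with no `+`, and it makes concatenation left-associative by writing it as `x -> atom | x atom`, `atom -> 'a' | '"' x '"'`, so any count above 1 comes from the quotes alone. The name `count_parses` is made up for the example.

```python
from functools import lru_cache


def count_parses(s):
    """Count parse trees of s under the assumed grammar
         x    -> atom | x atom          (left-associative concatenation)
         atom -> 'a' | '"' x '"'        (the same character opens and closes)
    """

    @lru_cache(maxsize=None)
    def count_atom(i, j):
        # Number of ways s[i:j] can be derived from `atom`.
        if j - i == 1:
            return 1 if s[i] == "a" else 0
        if j - i >= 3 and s[i] == '"' and s[j - 1] == '"':
            return count_x(i + 1, j - 1)
        return 0

    @lru_cache(maxsize=None)
    def count_x(i, j):
        # Number of ways s[i:j] can be derived from `x`.
        total = count_atom(i, j)
        for k in range(i + 1, j):
            # Split into a shorter x followed by one final atom.
            total += count_x(i, k) * count_atom(k, j)
        return total

    return count_x(0, len(s))


if __name__ == "__main__":
    for text in ['aaa', '"a"', '"a"a"a"']:
        print(text, count_parses(text))
```

A string with a count of 0 is rejected, a count of 1 means a unique parse, and anything larger means the string is ambiguous under this simplified grammar.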