ANTLR parser for and/or logic: how to get the expressions between logical operators?

I am using ANTLR to create an and/or parser and evaluator. Expressions will have a format like:

```
x eq 1 && y eq 10
```
```
(x lt 10 && x gt 1) OR x eq -1
```

I was reading this post about logical expressions in ANTLR, and I found the grammar posted there to be a good start:

grammar Logic;

parse
  :  expression EOF
  ;

expression
  :  implication
  ;

implication
  :  or ('->' or)*
  ;

or
  :  and ('||' and)*
  ;

and
  :  not ('&&' not)*
  ;

not
  :  '~' atom
  |  atom
  ;

atom
  :  ID
  |  '(' expression ')'
  ;

ID    : ('a'..'z' | 'A'..'Z')+;
Space : (' ' | '\t' | '\r' | '\n')+ {$channel=HIDDEN;};

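However, while getting a tree from the parser works for expressions whose variables are a single character (e.g. `(A || B) AND C`), I am having a hard time adapting this to my case. In the example `x eq 1 && y eq 10`, I would expect one `AND` parent with two children, `x eq 1` and `y eq 10`.

I believe this has to do with `ID`. What would be the correct syntax?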

ID    : ('a'..'z' | 'A'..'Z')+;

says that an identifier is a sequence of one or more letters, and does not allow any digits. Try

ID    : ('a'..'z' | 'A'..'Z' | '0'..'9')+;

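which will allow e.g. `abc`, `12ab`, and `ab12`. If you do not want the latter forms, you will have to restructure the rule slightly (left as a challenge...).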
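To make the difference between the two `ID` rules concrete, here is a minimal Python sketch that imitates them with regular expressions. The regexes are stand-ins chosen for illustration, not ANTLR output, and the helper name `tokenize_ids` is hypothetical.

```python
import re

# Stand-ins for the two lexer rules (assumption: a regex character class
# is a close enough approximation of the ANTLR character sets).
LETTERS_ONLY = re.compile(r"[A-Za-z]+")            # ID : ('a'..'z' | 'A'..'Z')+
LETTERS_AND_DIGITS = re.compile(r"[A-Za-z0-9]+")   # same rule with '0'..'9' added


def tokenize_ids(pattern, text):
    """Return every maximal run of characters accepted by `pattern`.

    Unlike an ANTLR lexer, findall() silently skips characters that do not
    match, so digits simply disappear with the letters-only pattern.
    """
    return pattern.findall(text)


print(tokenize_ids(LETTERS_ONLY, "x eq 10"))        # ['x', 'eq']
print(tokenize_ids(LETTERS_AND_DIGITS, "x eq 10"))  # ['x', 'eq', '10']
```

With the letters-only pattern the value `10` never becomes an identifier, which mirrors the problem described in the question.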

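In order to accept an arbitrary number of identifiers, you could define `atom` as `ID+` instead of `ID`.

Also, you will likely need to specify `AND`, `OR`, `->` and `~` as tokens so that, as @Bart Kiers says, the first two will not be classified as `ID`, and the latter two will be recognized at all.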
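The idea of giving the operators their own token types, so that keywords are not swallowed by `ID`, can be sketched with a small regex-based tokenizer in Python. This is an illustration only: the token names and the `tokenize` helper are hypothetical, and an ANTLR lexer resolves such conflicts by its own rules rather than by regex alternation order.

```python
import re

# Alternatives are tried left to right, so the operator patterns win over
# ID whenever both could match at the same position. The \b anchors keep
# words such as "ANDROID" from being split into AND + ROID.
TOKEN_RE = re.compile(r"""
    (?P<AND>\bAND\b|&&)
  | (?P<OR>\bOR\b|\|\|)
  | (?P<IMPLIES>->)
  | (?P<NOT>~)
  | (?P<LPAREN>\()
  | (?P<RPAREN>\))
  | (?P<ID>[A-Za-z0-9]+)
  | (?P<SPACE>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return (token_type, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SPACE":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("x eq 1 AND y eq 10"))
```

Here `AND` comes out with its own token type while `x`, `eq`, `1`, `y` and `10` are all reported as `ID`.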

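For those interested, I made some improvements to my grammar file (see below).

Current limitations:

Only works with &&/||, not AND/OR (not a big problem).

There cannot be spaces between a parenthesis and &&/|| (I work around this by replacing " (" with "(" and ") " with ")" in the source string before feeding it to the lexer).

grammar Logic;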

options {
  output = AST;
}

tokens {
  AND = '&&';
  OR  = '||';
  NOT = '~';
}

// parser/production rules start with a lower case letter
parse
  :  expression EOF!    // omit the EOF token
  ;

expression
  :  or
  ;

or
  :  and (OR^ and)*    // make `||` the root
  ;

and
  :  not (AND^ not)*      // make `&&` the root
  ;

not
  :  NOT^ atom    // make `~` the root
  |  atom
  ;

atom
  :  ID
  |  '('! expression ')'!    // omit both `(` and `)`
  ;

// lexer/terminal rules start with an upper case letter
ID
  :
    (
    'a'..'z'
    | 'A'..'Z'
    | '0'..'9' | ' '
    | SYMBOL
  )+ 
  ;

// helper for ID only; `fragment` keeps it from being emitted as a separate token
fragment SYMBOL
  :
    ('+'|'-'|'*'|'/'|'_')
 ;
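The parenthesis workaround mentioned in the limitations can be sketched as a small preprocessing step. This is a Python sketch under the assumption that the intent is to drop any whitespace directly before `(` and directly after `)`, because `ID` in this grammar accepts spaces and a stray space next to a parenthesis would otherwise become a token of its own. The function name is hypothetical and the language is chosen only for illustration.

```python
import re


def strip_spaces_around_parens(source: str) -> str:
    """Remove whitespace before '(' and after ')' ahead of lexing."""
    source = re.sub(r"\s+\(", "(", source)   # " (" -> "("
    source = re.sub(r"\)\s+", ")", source)   # ") " -> ")"
    return source


print(strip_spaces_around_parens("(x lt 10 && x gt 1) || x eq -1"))
# (x lt 10 && x gt 1)|| x eq -1
```

Spaces inside the comparisons are left alone, since they are meant to stay part of the `ID` token.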

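@Aasmund, I think this does not account for two things. First, a token can contain spaces ("green neq green" should be a single token). Second, when using AND/OR instead of &&/||, wouldn't ID need to exclude them, something like ~AND && ~OR (or ~"AND", ~"OR")?

@AasmundEldhuset: if you change atom to ID+ but do not add ' ' to ID, aren't you saying that you expect IDID...ID back to back, i.e. 1eq1 (without spaces)?

That grammar parses 1 eq 1 || B as a || parent with 4 children (1, eq, 1, B) instead of 2 children (1 eq 1, B).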
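The tree shape asked for in these comments (a `||` parent with the two children `1 eq 1` and `B`) can be illustrated outside ANTLR with a small hand-written recursive-descent parser. The Python sketch below is not the grammar from this thread: the `Parser` class and `tokenize` helper are hypothetical, and it assumes that only `&&`, `||`, `~` and parentheses act as delimiters, so everything between them is kept as one leaf.

```python
import re

# Operators and parentheses are the only delimiters; the text between
# them (spaces included) becomes a single leaf.
SPLIT_RE = re.compile(r"(&&|\|\||~|\(|\))")
DELIMITERS = {"&&", "||", "~", "(", ")"}


def tokenize(text):
    """Split on delimiters and drop empty or blank pieces."""
    return [part.strip() for part in SPLIT_RE.split(text) if part.strip()]


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, value):
        if self.peek() != value:
            raise SyntaxError(f"expected {value!r}, got {self.peek()!r}")
        self.pos += 1

    def parse(self):
        tree = self.parse_or()
        if self.peek() is not None:
            raise SyntaxError(f"unexpected token {self.peek()!r}")
        return tree

    def parse_or(self):             # or : and ('||' and)*
        node = self.parse_and()
        while self.peek() == "||":
            self.expect("||")
            node = ("||", node, self.parse_and())
        return node

    def parse_and(self):            # and : not ('&&' not)*
        node = self.parse_not()
        while self.peek() == "&&":
            self.expect("&&")
            node = ("&&", node, self.parse_not())
        return node

    def parse_not(self):            # not : '~' atom | atom
        if self.peek() == "~":
            self.expect("~")
            return ("~", self.parse_atom())
        return self.parse_atom()

    def parse_atom(self):           # atom : leaf | '(' or ')'
        if self.peek() == "(":
            self.expect("(")
            node = self.parse_or()
            self.expect(")")
            return node
        token = self.peek()
        if token is None or token in DELIMITERS:
            raise SyntaxError(f"expected an operand, got {token!r}")
        self.pos += 1
        return token


print(Parser(tokenize("x eq 1 && y eq 10")).parse())
# ('&&', 'x eq 1', 'y eq 10')
```

Each operator node is a tuple whose first element is the operator and whose remaining elements are its children, loosely mirroring what the `^` root operator does in the grammar above.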
