ANTLR ignores underscore characters
If I run an input that begins with underscores (such as "___sad") through the interpreter for the following grammar:
grammar identTest;
options
{
language = Java;
output=AST;
}
goal: identifier;
fragment Letter: (('a'..'z') | ('A'..'Z'));
fragment Digit : '0' .. '9';
identifier :IDENTIFIER;
IDENTIFIER: Letter+;
WS:(' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;};
Interpreter output:
Debugger output:
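The interpreter includes the underscore characters, while the debugger seems to ignore them! In this case I would expect some kind of exception, since the grammar only defines the letters 'a'..'z' and 'A'..'Z'. What is wrong with my grammar?
Don't use the interpreter: it is buggy.
Use the debugger instead. After you press the Output button (bottom left), it shows the warnings, errors and exceptions produced while parsing. When you do that, you will see the following: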
.../__Test___input.txt line 1:0 no viable alternative at character '_'
.../__Test___input.txt line 1:1 no viable alternative at character '_'
.../__Test___input.txt line 1:2 no viable alternative at character '_'
The lexer reports each underscore, skips it, and carries on, so the parser only ever sees the rest of the input.
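To make the recovery behaviour concrete, here is a minimal hand-written sketch in plain Java. It is not the code ANTLR generates; the class name `RecoveringLexerSketch` and its `Result` type are hypothetical, and it assumes single-line input. It only imitates the observable effect: an unmatched character is reported and dropped, and scanning continues.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: imitates "report the error, skip one
// character, keep going". It is NOT the lexer ANTLR generates.
public class RecoveringLexerSketch {

    /** Tokens found, plus the errors reported while scanning. */
    public static final class Result {
        public final List<String> tokens = new ArrayList<>();
        public final List<String> errors = new ArrayList<>();
    }

    // Mirrors the Letter fragment: ('a'..'z') | ('A'..'Z')
    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Mirrors the WS rule; '\f' is the form feed written as '\u000C' in the grammar.
    private static boolean isWhitespace(char c) {
        return c == ' ' || c == '\r' || c == '\t' || c == '\f' || c == '\n';
    }

    public static Result tokenize(String input) {
        Result result = new Result();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isLetter(c)) {
                // IDENTIFIER : Letter+
                int start = i;
                while (i < input.length() && isLetter(input.charAt(i))) {
                    i++;
                }
                result.tokens.add(input.substring(start, i));
            } else if (isWhitespace(c)) {
                i++; // hidden channel: dropped silently
            } else {
                // No rule matches: record the error, skip the character, continue.
                // The line number is fixed at 1 because single-line input is assumed.
                result.errors.add("line 1:" + i
                        + " no viable alternative at character '" + c + "'");
                i++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Result r = tokenize("___sad");
        r.errors.forEach(System.out::println);
        System.out.println("tokens: " + r.tokens);
    }
}
```

With the input `___sad`, the sketch records three errors and still yields the single token `sad`, which is why the debugger appears to ignore the underscores.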
If you do not want the lexer to recover from these "no viable alternative" errors, create a fall-through lexer rule (called OTHER here) and throw an exception from it:
grammar identTest;
options
{
language = Java;
output=AST;
}
goal : identifier;
identifier : IDENTIFIER;
IDENTIFIER : Letter+;
WS : (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;};
OTHER : . {throw new RuntimeException("unknown char: '" + $text + "'");};
fragment Letter : (('a'..'z') | ('A'..'Z'));
fragment Digit : '0' .. '9';
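The effect of the OTHER rule can be sketched in plain Java as well. As before, this is a hypothetical hand-written scanner (`StrictLexerSketch` is an invented name), not ANTLR output. It assumes that the fall-through case is tried only when neither IDENTIFIER nor WS matches, which is why OTHER is placed after them in the grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: the final branch plays the role of
// OTHER : . {throw new RuntimeException(...);}
public class StrictLexerSketch {

    // Mirrors the Letter fragment: ('a'..'z') | ('A'..'Z')
    private static boolean isLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Mirrors the WS rule; '\f' is the form feed written as '\u000C' in the grammar.
    private static boolean isWhitespace(char c) {
        return c == ' ' || c == '\r' || c == '\t' || c == '\f' || c == '\n';
    }

    public static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isLetter(c)) {
                // IDENTIFIER : Letter+
                int start = i;
                while (i < input.length() && isLetter(input.charAt(i))) {
                    i++;
                }
                tokens.add(input.substring(start, i));
            } else if (isWhitespace(c)) {
                i++; // hidden channel: dropped silently
            } else {
                // Fall-through: abort on the first character no other rule accepts.
                throw new RuntimeException("unknown char: '" + c + "'");
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("sad foo"));
        try {
            tokenize("___sad");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The trade-off is that scanning stops at the first offending character, so only one problem is reported per run instead of the full list shown by the debugger.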