ANTLR fuzzy parsing

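I am building a kind of preprocessor in ANTLR v3, which of course only works with fuzzy parsing. At the moment I am trying to parse include statements and replace them with the contents of the corresponding file.

Based on an example I found, I wrote the following grammar: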

grammar preprocessor;

options {
    language='Java';
}

@lexer::header {

package antlr_try_1;

}

@parser::header {

package antlr_try_1;

}

parse
 : (t=. {System.out.print($t.text);})* EOF
 ;

INCLUDE_STAT
 : 'include' (' ' | '\r' | '\t' | '\n')+ ('A'..'Z' | 'a'..'z' | '_' | '-' | '.')+
   {
     setText("Include statement found!");
   }
 ;

Any
 : . // fall through rule, matches any character
 ;

This grammar only prints the text and replaces each include statement with the string "Include statement found!". The sample text to parse looks like this:

some random input
some random input
some random input

include some_file.txt

some random input
some random input
some random input

The resulting output looks like this:

C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 1:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 2:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 3:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 7:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 8:14 mismatched character 'p' expecting 'c'
C:\Users\andriyn\Documents\SandBox\text_files\asd.txt line 9:14 mismatched character 'p' expecting 'c'
some random ut
some random ut
some random ut

Include statement found!

some random ut
some random ut
some random ut
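
As far as I can tell, the lexer is confused by the "in" in the word "input": it "thinks" that an INCLUDE_STAT token is about to start, consumes those characters, and then reports an error when it sees 'p' instead of 'c'.

Is there a better way to do this? I cannot use the filter option, because I need not only the include statements but also the rest of the code. I have tried several other approaches but could not find a suitable solution.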

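You are observing a limitation of ANTLR 3. You can use either of the following options to solve the immediate problem:

  • Upgrade to ANTLR 4, which does not have this limitation.
  • Add the following syntactic predicate at the beginning of the INCLUDE_STAT rule:

    `('include' (' ' | '\r' | '\t' | '\n')+ ('A'..'Z' | 'a'..'z' | '_' | '-' | '.')+) =>`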
    

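To illustrate the behaviour the predicate is meant to enforce (commit to INCLUDE_STAT only when the whole statement matches, otherwise fall through to a single character), here is a minimal hand-written sketch in Java. It does not use ANTLR at all: `java.util.regex` stands in for the lexer rule, and the class name `FuzzyIncludeScanner` and the method `scan` are hypothetical names chosen for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the lexer; not generated by or related to ANTLR.
class FuzzyIncludeScanner {
    // Same shape as INCLUDE_STAT: 'include', whitespace, then a file name.
    private static final Pattern INCLUDE =
        Pattern.compile("include[ \\r\\t\\n]+[A-Za-z_.-]+");

    /** Replaces every include statement and copies all other characters unchanged. */
    static String scan(String input) {
        StringBuilder out = new StringBuilder();
        Matcher m = INCLUDE.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            // Try the complete include pattern at the current position first.
            m.region(pos, input.length());
            if (m.lookingAt()) {
                out.append("Include statement found!");
                pos = m.end();
            } else {
                // Fall-through case: emit exactly one character, like the Any rule.
                out.append(input.charAt(pos));
                pos++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "some random input\n\ninclude some_file.txt\n\nsome random input\n";
        System.out.print(scan(text));
    }
}
```

Because the full pattern is tested before anything is consumed, a word such as "input" that merely shares a prefix with "include" passes through one character at a time and is never truncated.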
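I just tried this syntactic predicate, but without success: the result looks the same. So I decided to move to ANTLR 4. My first experience is that fuzzy parsing works very well there, so I am accepting your answer.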