Antlr3 - Non-greedy double-quoted string with escaped double quotes

parsing antlr


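The following Antlr3 grammar file does not cater for escaped double quotes as part of the STRING lexer rule. Any ideas why?

Expressions working:

\"hello\"
ref(\"hello\",\"hello\")

Expressions not working:

\"h\"e\"l\"l\"o\"
ref(\"hello\", \"hel\"lo\")

Antlr3 grammar file, runnable in AntlrWorks: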

grammar Grammar;

options
{
    output=AST;
    ASTLabelType=CommonTree;
    language=CSharp3;
}

public oaExpression
   : exponentiationExpression EOF!
   ;

exponentiationExpression
    :       equalityExpression ( '^' equalityExpression )*
    ;

equalityExpression
    :       relationalExpression ( ( ('==' | '=' ) | ('!=' | '<>' ) ) relationalExpression )*
    ;

relationalExpression
    :       additiveExpression ( ( '>' | '>=' | '<' | '<=' ) additiveExpression )*
    ;

additiveExpression
    :       multiplicativeExpression ( ( '+' | '-' ) multiplicativeExpression )*
    ;

multiplicativeExpression
    :       primaryExpression ( ( '*' | '/' ) primaryExpression )*
    ;

primaryExpression
    :       '(' exponentiationExpression ')' | value | identifier (arguments )?
    ;

value
    :       STRING
    ;

identifier
    :       ID
    ;

expressionList
    :       exponentiationExpression ( ',' exponentiationExpression )*
    ;

arguments
    :       '(' ( expressionList )? ')'
    ;                      

/*
 * Lexer rules
 */

ID
    :       LETTER (LETTER | DIGIT)*
    ;

STRING
    :       '"' ( options { greedy=false; } : ~'"' )* '"'
    ;

WS
    :       (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=Hidden;}
    ;

/*
 * Fragment Lexer rules
 */

fragment
LETTER
    :       'a'..'z'
    |       'A'..'Z'
    |       '_'
    ;

fragment
EXPONENT
    :       ('e'|'E') ('+'|'-')? ( DIGIT )+
    ;

fragment
HEX_DIGIT
    :       ( DIGIT |'a'..'f'|'A'..'F')
    ;

fragment
DIGIT
    :       '0'..'9'
    ;

This is how I handle strings that can contain escape sequences (not just \", but any):
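The rule also counts the escapes and stores the count in the token. This lets the receiver quickly see whether it needs to do anything with the string (if user1 > 0). If you don't need that, remove the @init part and the action.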
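
To illustrate what the receiver side of such an escape count could look like, here is a minimal Python sketch. It is not ANTLR code and not the rule from the answer: `StringToken` is a hypothetical stand-in for a lexer token, `user1` is assumed to hold the escape count set by the lexer action, and the unescaping convention (a backslash followed by any character stands for that character) is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class StringToken:
    """Hypothetical stand-in for a lexer token; not an ANTLR class."""
    text: str   # raw token text, including the surrounding double quotes
    user1: int  # escape count, assumed to have been stored by the lexer action


def string_value(token: StringToken) -> str:
    """Return the string contents, unescaping only when the lexer saw escapes."""
    body = token.text[1:-1]  # strip the surrounding quotes
    if token.user1 == 0:
        return body          # fast path: nothing to unescape
    out = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            out.append(body[i + 1])  # assumption: "\x" simply stands for "x"
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)


plain = StringToken(text='"hello"', user1=0)
escaped = StringToken(text=r'"h\"e\"llo"', user1=2)
print(string_value(plain))
print(string_value(escaped))
```

The point of the count is only the early return: tokens without escapes skip the character-by-character pass entirely.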
Try the following:
STRING
 : '"'                          // a opening quote
   (                            // start group
     '\\' ~('\r' | '\n')        // an escaped char other than a line break char
     |                          // OR
     ~('\\' | '"'| '\r' | '\n') // any char other than '"', '\' and line breaks
   )*                           // end group and repeat zero or more times
   '"'                          // the closing quote
 ;
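
As a rough way to see why this rule accepts escaped quotes while the original non-greedy rule does not, the two rules can be approximated with regular expressions. This is only a sketch in Python, not ANTLR: the pattern names and the `first_token` helper are made up for illustration, and a regex engine is not the ANTLR lexer, although for these two simple rules the matching behaviour should be comparable.

```python
import re

# Approximation of the original rule:
#   '"' ( options { greedy=false; } : ~'"' )* '"'
# A backslash is just another non-quote character here, so the very next '"'
# ends the token, even when it is preceded by a backslash.
NON_GREEDY = re.compile(r'"[^"]*?"')

# Approximation of the suggested rule:
#   '"' ( '\\' ~('\r' | '\n') | ~('\\' | '"' | '\r' | '\n') )* '"'
# A backslash always consumes the following character, so \" stays inside.
ESCAPE_AWARE = re.compile(r'"(?:\\[^\r\n]|[^\\"\r\n])*"')

SAMPLES = [
    r'"\"hello\""',
    r'"ref(\"hello\",\"hello\")"',
    r'"\"h\"e\"l\"l\"o\""',
    r'"ref(\"hello\", \"hel\"lo\")"',
]


def first_token(pattern, text):
    """Return the text matched at the start of the input, or None."""
    m = pattern.match(text)
    return m.group(0) if m else None


results = [(first_token(NON_GREEDY, s), first_token(ESCAPE_AWARE, s)) for s in SAMPLES]
for sample, (old, new) in zip(SAMPLES, results):
    print(f"{sample!r}: old={old!r} new={new!r}")
```

With the non-greedy pattern the first token stops at the first escaped quote, leaving the rest of the input to be tokenized as something else; the escape-aware pattern consumes each sample as a single string.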

When I test the 4 different test cases from your comment:
"\"hello\""
"ref(\"hello\",\"hello\")"
"\"h\"e\"l\"l\"o\""
"ref(\"hello\", \"hel\"lo\")"

with the lexer rule I suggested:
grammar T;

parse
 : string+ EOF
 ;

string
 : STRING
 ;

STRING
 : '"' ('\\' ~('\r' | '\n') | ~('\\' | '"'| '\r' | '\n'))* '"'
 ;

SPACE
 : (' ' | '\t' | '\r' | '\n')+ {skip();}    
 ;

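ANTLRWorks' debugger generates the following parse tree:

(parse tree image)

In other words: it works just fine (on my machine :)).
EDIT II
I also used your grammar (with some small changes to make it compatible with Java), replacing the incorrect STRING rule with the one I suggested: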
oaExpression
   :        STRING+ EOF!
   //: exponentiationExpression EOF!
   ;

exponentiationExpression
    :       equalityExpression ( '^' equalityExpression )*
    ;

equalityExpression
    :       relationalExpression ( ( ('==' | '=' ) | ('!=' | '<>' ) ) relationalExpression )*
    ;

relationalExpression
    :       additiveExpression ( ( '>' | '>=' | '<' | '<=' ) additiveExpression )*
    ;

additiveExpression
    :       multiplicativeExpression ( ( '+' | '-' ) multiplicativeExpression )*
    ;

multiplicativeExpression
    :       primaryExpression ( ( '*' | '/' ) primaryExpression )*
    ;

primaryExpression
    :       '(' exponentiationExpression ')' | value | identifier (arguments )?
    ;

value
    :       STRING
    ;

identifier
    :       ID
    ;

expressionList
    :       exponentiationExpression ( ',' exponentiationExpression )*
    ;

arguments
    :       '(' ( expressionList )? ')'
    ;                      

/*
 * Lexer rules
 */

ID
    :       LETTER (LETTER | DIGIT)*
    ;

//STRING
//    :       '"' ( options { greedy=false; } : ~'"' )* '"'
//    ;
STRING
    :       '"' ('\\' ~('\r' | '\n') | ~('\\' | '"'| '\r' | '\n'))* '"'
    ;

WS
    :       (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=HIDDEN;} /*{$channel=Hidden;}*/
    ;

/*
 * Fragment Lexer rules
 */

fragment
LETTER
    :       'a'..'z'
    |       'A'..'Z'
    |       '_'
    ;

fragment
EXPONENT
    :       ('e'|'E') ('+'|'-')? ( DIGIT )+
    ;

fragment
HEX_DIGIT
    :       ( DIGIT |'a'..'f'|'A'..'F')
    ;

fragment
DIGIT
    :       '0'..'9'
    ;

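Why? I mean, the rule only matches the entire input when there is no closing quote, in which case the input is invalid, right? Can you clarify?

Hi @BartKiers, I edited the question to provide the full grammar. I tried your suggestions, but they don't seem to work.

I tried your suggestion, but it doesn't seem to work for me. I have updated my question to include the full grammar.

Hi @BartKiers, the lexer rule above works for the following scenarios: "\"hello\"", "ref(\"hello\",\"hello\")" but does not work for the escaped double quote scenarios: "\"h\"e\"l\"l\"o\"", "ref(\"hello\", \"hel\"lo\")". Thanks for your help.

@bjrave, I don't know how you're testing it, but it works just fine. Good luck.

I agree that your cut-down grammar works fine in ANTLRWorks. Using both ANTLRWorks and the parser compiled via C# code, the STRING lexer rule doesn't seem to work once it is integrated with the rest of my grammar file. I have added a link to my grammar file, should you want to take a look.

@bjrave, no, I don't want to take a look: your grammar is full of embedded code, which means I would need to filter all of that out before I could give it a spin. Perhaps if you post the grammar without code in your original question, I or someone else might take a look. External links are usually not looked ...

I see what you mean. I have removed the embedded code from the grammar file.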