ANTLR grammar token problem (ANTLRWorks)


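I'm an ANTLR hobbyist building an interpreter for a simple processor, and I've hit a small problem: the VALUE token is throwing an error. I'm a student, so I'm not asking you to do my homework for me... I've essentially finished it (including all the class files for the interpreter), but this one issue is beating me, even though it's probably simple and right in front of me.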

ANTLRWorks keeps giving me this console error message:

"error(208): newExpr.g:193:1: The following token definitions can never be matched because prior tokens match the same input: VALUE"

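Clearly something is wrong with the regular expression for VALUE, but I can't see what it is, either in the rule itself or anywhere else in the grammar. I'd be grateful if you could point out what I'm missing... Googling hasn't really helped me find the error in my own grammar.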

grammar newExpr;

options 
{
    language=Java;
}

@header 
{
    import java.util.*;
}

@members 
{
    ArrayList myInitialise = new ArrayList();
    ArrayList InstructionList = new ArrayList();
}

/*--------------------------------------------------------------------------------------------------------------------------------*
 * PARSER RULES                                                                                                                   *
 *--------------------------------------------------------------------------------------------------------------------------------*/

/*
* prog is where the interpretation begins and consists of one or more (+) 'stat' rules
*/
prog        :       stat+;

/*
* stat rules are the general parse rules of entire operations on the processor.
* They consist of smaller data operations rules (dataop) or memory operations (memop).
*/                
stat        :       BASIC r1=REG c1=COMMA r2=REG c2=COMMA dataop NEWLINE
            {
                int reg1 = Integer.parseInt($r1.text.substring(1));  // strip the leading 'R' and parse the register number as an integer
                int reg2 = Integer.parseInt($r2.text.substring(1)); 
                int IMDT = $dataop.value;    // take the immediate integer

                // LOAD operation
                if($BASIC.text.equals("LD"))
                InstructionList.add(new ld(reg1, reg2, IMDT));

                // STORE operation  
                else if($BASIC.text.equals("ST"))
                InstructionList.add(new st(reg1, reg2, IMDT));

                // SUBTRACTION operation    
                else if($BASIC.text.equals("SUB"))
                InstructionList.add(new sub(reg1, reg2, IMDT));

                // ADDITION operation   
                else if($BASIC.text.equals("ADD"))
                InstructionList.add(new add(reg1, reg2, IMDT));

                // MULTIPLICATION operation 
                else if($BASIC.text.equals("MUL"))
                InstructionList.add(new mul(reg1, reg2, IMDT));

                // DIVISION operation   
                else if($BASIC.text.equals("DIV"))
                InstructionList.add(new div(reg1, reg2, IMDT));
            }

            | 

            i1 = INDEX '=' memop NEWLINE
            {
                myInitialise.add(new memInit(Integer.parseInt($i1.text), $memop.value));
            }

            |

            BRANCH REG COMMA dataop NEWLINE
            {
                int R = Integer.parseInt($REG.text.substring(1));
                int val = $dataop.value;

                // BRANCH EQUAL operation
                if($BRANCH.text.equals("BEZ"))
                InstructionList.add(new branchEqualZero(R,val));

                // BRANCH NOT EQUAL operation
                else if($BRANCH.text.equals("BNEZ"))
                InstructionList.add(new branchNotEqualZero(R,val));
            }

            | 

            JUMP REG NEWLINE
            {
                int R = Integer.parseInt($REG.text.substring(1));
                InstructionList.add(new jump(R));
            }

            | 

            HALT 
            {
                InstructionList.add(new halt());
            }
            ;


dataop returns [int value] 

        :   INDEX
            {
                $value = Integer.parseInt($INDEX.text);
            }

            |   

            VALUE
            {
                $value = Integer.parseInt($VALUE.text.substring(1))*-1;
            };


memop returns [int value]

        :   INDEX
            {
                $value = Integer.parseInt($INDEX.text);
            }

            |

            VALUE
            {
                $value = Integer.parseInt($VALUE.text.substring(1))*-1;
            }

            |

            MEMVAL
            {
                if($MEMVAL.text.startsWith("-"))
                {
                    $value = Integer.parseInt($MEMVAL.text.substring(1))*-1;
                }
                else
                    $value = Integer.parseInt($MEMVAL.text);
            };


/*--------------------------------------------------------------------------------------------------------------------------------*
 * LEXER RULES                                                                                                                    *
 *--------------------------------------------------------------------------------------------------------------------------------*/

/*
* RegExps for BASIC instructions (load, store, add, subtract, multiply, divide)
*/
BASIC       :   ('L' 'D') | ('S' 'T') | ('A' 'D' 'D') | ('S' 'U' 'B') | ('M' 'U' 'L') | ('D' 'I' 'V');

/*
* The comma is simply for syntactic purposes, to separate data and register references
*/
COMMA       :   ',';

/*
* Regular Expressions for the processor registers R0-R31
*/
REG         :   ('R') (('0'..'9') | ('0'..'2') ('0'..'9') | ('3') ('0'..'1') );

/*
* 'Index' is the set of regular expressions matching memory locations
*/
INDEX       :       ('0'..'9')                    
            |
            ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'5') ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('0'..'4') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('0'..'4') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('5') ('0'..'2') ('0'..'9')
            |
            ('6') ('5') ('5') ('3') ('0'..'5');

/*
* Reg Exps for memory initialisation instructions
*/
MEMVAL      :   ('0'..'9')+ | '-' ('0'..'9')+;            

/*
* Simple integers for data values
*/          
VALUE       :   '-' (('0'..'9')         // <-- PROBLEM IS HERE
            |
            ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('0'..'5') ('0'..'9') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('0'..'4') ('0'..'9') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('0'..'4') ('0'..'9') ('0'..'9')
            |
            ('6') ('5') ('5') ('0'..'2') ('0'..'9')
            |
            ('6') ('5') ('5') ('3') ('0'..'6'));

/*
* Regular Expressions for return/newline characters
*/ 
NEWLINE     :   '\r'? '\n' ;


/*
* This simply makes the interpreter tolerant to whitespace
*/
WHITESPACE      :   (' ' | '\t' | '\u000C')+ {skip();};

/*
* RegExp for Branch on Equal to Zero/Branch on Not Equal to Zero instructions
*/
BRANCH      :   ('B' 'E' 'Z') | ('B' 'N' 'E' 'Z');

/*
* RegExp for jump instruction
*/
JUMP        :   ('J' 'R');

/*
* The HALT instruction ends the program and executes all instructions
* in the Instruction List on the data/values that have been entered
*/
HALT        :   ('H' 'A' 'L' 'T');

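An ANTLR-generated lexer works like this: it tries to match as much input as possible, and when two (or more) rules match the same number of characters, the rule defined first "wins". Your VALUE rule can therefore never win against the MEMVAL rule, because everything VALUE matches is also matched by this alternative of MEMVAL:

'-' ('0'..'9')+

That is why you see the error message.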

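Even if one of your parser rules needs a VALUE token at some point, the lexer will still produce tokens only according to the rule I described above: the lexer does not take any information from the parser into account.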
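The selection rule described above (longest match first, earliest-defined rule on a tie) can be modelled with a small Java sketch. This is only a toy model, not ANTLR's actual implementation: the class `TokenChooser` and its methods are hypothetical, and the VALUE pattern is simplified to "a minus sign followed by one to five digits", ignoring the 65536 upper bound in the grammar.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: mimics how a token rule is chosen, not ANTLR's real code.
class TokenChooser {
    // Rules in definition order; LinkedHashMap preserves insertion order.
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    void addRule(String name, String regex) {
        rules.put(name, Pattern.compile(regex));
    }

    // Returns the name of the rule that wins at the start of the input,
    // or null if no rule matches at least one character.
    String choose(String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> e : rules.entrySet()) {
            Matcher m = e.getValue().matcher(input);
            // lookingAt() anchors the match at the start of the input.
            // The strict '>' keeps the earlier-defined rule when lengths tie.
            if (m.lookingAt() && m.end() > bestLen) {
                best = e.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TokenChooser lexer = new TokenChooser();
        lexer.addRule("MEMVAL", "[0-9]+|-[0-9]+"); // defined first, as in the grammar
        lexer.addRule("VALUE", "-[0-9]{1,5}");     // simplified stand-in for VALUE
        System.out.println(lexer.choose("-42"));
    }
}
```

In this model, every input that the VALUE pattern accepts is accepted at the same length by MEMVAL, which was registered first, so `choose` never returns "VALUE". Registering VALUE before MEMVAL would change the outcome for short negative numbers.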

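Simply remove the VALUE rule and use MEMVAL in its place (or rename MEMVAL to INT). Then, in your parser rules, just match MEMVAL (or INT) and check that the value lies within the required numeric range.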
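As a minimal sketch of that range check, the helper below parses the token text and rejects out-of-range numbers. The class name `RangeCheck`, the method `parseInRange`, and the choice to throw an exception are all assumptions made for illustration; the bounds used in the usage note are inferred from the INDEX and VALUE rules in the question.

```java
// Hypothetical helper for use inside parser actions; not part of ANTLR.
class RangeCheck {
    // Parses the token text (an optional leading '-' followed by digits)
    // and verifies that the result lies in [min, max].
    static int parseInRange(String text, int min, int max) {
        int v = Integer.parseInt(text); // parseInt accepts a leading minus sign
        if (v < min || v > max) {
            throw new IllegalArgumentException(
                "value " + v + " is outside the range " + min + ".." + max);
        }
        return v;
    }

    public static void main(String[] args) {
        System.out.println(parseInRange("-42", -65536, 65535));
    }
}
```

Inside a rule such as dataop, the action could then read something like `$value = RangeCheck.parseInRange($INT.text, -65536, 65535);`, assuming the token has been renamed to INT. Because `NumberFormatException` is a subclass of `IllegalArgumentException`, a caller can handle both malformed and out-of-range input with a single catch.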

Ah, right... I didn't know it handled things that way, but it makes sense. Thanks, you've been a big help!