Validating a "break" statement with a recursive descent parser
We implement a small programming language using a recursive descent parser. Among other things, it has the following statements:
statement → exprStmt
| ifStmt
| printStmt
| whileStmt
| block ;
block → "{" declaration* "}" ;
whileStmt → "while" "(" expression ")" statement ;
ifStmt → "if" "(" expression ")" statement ( "else" statement )? ;
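One of the exercises is to add a `break` statement to the language. In addition, placing this statement outside a loop should be a syntax error. Of course, when it is inside a loop, it may appear within other blocks, `if` statements, and so on.
My first approach was to create a new rule, `whileBody`, that accepts `break`: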
## FIRST TRY
statement → exprStmt
| ifStmt
| printStmt
| whileStmt
| block ;
block → "{" declaration* "}" ;
whileStmt → "while" "(" expression ")" whileBody ;
whileBody → statement
| break ;
break → "break" ";" ;
ifStmt → "if" "(" expression ")" statement ( "else" statement )? ;
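But we also have to accept `break` inside nested loops, `if` conditions, and so on. The only thing I can think of is a new set of rules for blocks and conditionals that accept `break`: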
## SECOND TRY
statement → exprStmt
| ifStmt
| printStmt
| whileStmt
| block ;
block → "{" declaration* "}" ;
whileStmt → "while" "(" expression ")" whileBody ;
whileBody → statement
| break
| whileBlock
| whileIfStmt ;
whileBlock → "{" (declaration | break)* "}" ;
whileIfStmt → "if" "(" expression ")" whileBody ( "else" whileBody )? ;
break → "break" ";" ;
ifStmt → "if" "(" expression ")" statement ( "else" statement )? ;
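This is not unmanageable for now, but it could become cumbersome once the language grows. Even today it is tedious and error-prone to write.
I looked at the BNF specifications of other languages for inspiration. Apparently, none of them prohibits `break` outside a loop. I guess their parsers have ad hoc code to prevent it. So I did the same.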
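Such an ad hoc check might look roughly like the following Python sketch. It is not taken from any real parser: the `Parser` class, its `loop_depth` counter, the whitespace-separated tokens, and the one-token stand-in for `expression` are all assumptions made to keep the example small.

```python
class ParseError(Exception):
    """Raised when the input violates the grammar or the break rule."""


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.loop_depth = 0  # number of enclosing while loops (assumed state)

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise ParseError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def program(self):
        while self.peek() is not None:
            self.statement()

    def statement(self):
        tok = self.peek()
        if tok == "while":
            self.while_stmt()
        elif tok == "if":
            self.if_stmt()
        elif tok == "{":
            self.block()
        elif tok == "break":
            self.break_stmt()
        else:
            self.expr_stmt()

    def expression(self):
        # Stand-in for the real expression grammar: exactly one token.
        if self.peek() in (None, ";", "(", ")", "{", "}"):
            raise ParseError(f"expected expression, found {self.peek()!r}")
        self.pos += 1

    def condition(self):
        self.expect("(")
        self.expression()
        self.expect(")")

    def while_stmt(self):
        self.expect("while")
        self.condition()
        self.loop_depth += 1  # the body is inside one more loop
        try:
            self.statement()
        finally:
            self.loop_depth -= 1

    def if_stmt(self):
        self.expect("if")
        self.condition()
        self.statement()
        if self.peek() == "else":
            self.pos += 1
            self.statement()

    def block(self):
        self.expect("{")
        while self.peek() not in ("}", None):
            self.statement()
        self.expect("}")

    def break_stmt(self):
        # The ad hoc check: the grammar itself accepts break anywhere.
        if self.loop_depth == 0:
            raise ParseError("'break' outside of a loop")
        self.expect("break")
        self.expect(";")

    def expr_stmt(self):
        self.expression()
        self.expect(";")


def is_valid(source):
    """Return True if the whitespace-separated source parses."""
    try:
        Parser(source.split()).program()
        return True
    except ParseError:
        return False
```

With this sketch, `is_valid("while ( x ) { if ( y ) break ; }")` succeeds, while `is_valid("if ( x ) break ;")` fails because `loop_depth` is still zero when `break` is reached.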
TL;DR
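My questions all concern how to restrict the `break` statement to loops.

Although `while` is not the only loop construct, I used a different approach to describe the alternatives: appending `_B` to the names of the non-terminals that may contain a `break` statement.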
declaration → varDecl
| statement
declaration_B → varDecl
| statement_B
statement → exprStmt
| ifStmt
| printStmt
| whileStmt
| block
statement_B → exprStmt
| printStmt
| whileStmt
| breakStmt
| ifStmt_B
| block_B
breakStmt → "break" ";"
ifStmt → "if" "(" expression ")" statement ( "else" statement )?
ifStmt_B → "if" "(" expression ")" statement_B ( "else" statement_B )?
whileStmt → "while" "(" expression ")" statement_B ;
block → "{" declaration* "}"
block_B → "{" declaration_B* "}"
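Not all statement types need to be duplicated. Non-compound statements, such as `exprStmt`, don't, because they cannot possibly contain a `break` statement (or any other statement type). And the statement that is the target of a loop statement (such as `whileStmt`) can always contain `break`, regardless of whether the `while` is itself inside a loop.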
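Since this is a top-down (recursive descent) parser, it is quite simple to handle this condition in the implementation of the parser. You just need to add a parameter to every (or many) of the parsing functions that specifies whether a `break` is allowed. Any parsing function called by `whileStmt` sets that parameter to `True` (or to an enum value indicating that a break is possible), other statement types simply pass the parameter through, and the top-level parsing function sets it to `False`. If the `breakStmt` implementation is called with `False`, it simply returns failure.

Attribute grammars are good at this sort of thing. Define an inherited attribute (I'll call it LC, for loop count). The "program" non-terminal passes LC=0 to its children; a loop passes LC=$LC+1 to its children; all other constructs pass LC=$LC to their children. Make the "break" rule syntactically valid only when $LC > 0.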
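In a recursive descent parser, the extra parameter and the inherited loop-count attribute come down to the same thing. The following is a minimal Python sketch of that idea, not a definitive implementation: the function names, the `lc` parameter, the whitespace-separated tokens, and the one-token stand-in for `expression` are assumptions made for illustration.

```python
class ParseError(Exception):
    """Raised when the input violates the grammar or the break rule."""


def expect(tokens, pos, token):
    """Consume one specific token and return the next position."""
    if pos >= len(tokens) or tokens[pos] != token:
        found = tokens[pos] if pos < len(tokens) else "end of input"
        raise ParseError(f"expected {token!r}, found {found!r}")
    return pos + 1


def expression(tokens, pos):
    # Stand-in for the real expression grammar: exactly one token.
    if pos >= len(tokens) or tokens[pos] in {";", "(", ")", "{", "}"}:
        raise ParseError("expected expression")
    return pos + 1


def condition(tokens, pos):
    pos = expect(tokens, pos, "(")
    pos = expression(tokens, pos)
    return expect(tokens, pos, ")")


def statement(tokens, pos, lc):
    """Parse one statement; lc is the inherited loop count."""
    tok = tokens[pos] if pos < len(tokens) else None
    if tok == "while":
        pos = condition(tokens, pos + 1)
        return statement(tokens, pos, lc + 1)  # the loop passes LC + 1 down
    if tok == "if":
        pos = condition(tokens, pos + 1)
        pos = statement(tokens, pos, lc)  # other constructs pass LC unchanged
        if pos < len(tokens) and tokens[pos] == "else":
            pos = statement(tokens, pos + 1, lc)
        return pos
    if tok == "{":
        pos += 1
        while pos < len(tokens) and tokens[pos] != "}":
            pos = statement(tokens, pos, lc)
        return expect(tokens, pos, "}")
    if tok == "break":
        if lc == 0:  # the guard: break is only valid when LC > 0
            raise ParseError("'break' outside of a loop")
        return expect(tokens, pos + 1, ";")
    pos = expression(tokens, pos)
    return expect(tokens, pos, ";")


def parse_program(source):
    """Parse a whole program; the top level passes LC = 0."""
    tokens = source.split()
    pos = 0
    while pos < len(tokens):
        pos = statement(tokens, pos, 0)


def accepts(source):
    try:
        parse_program(source)
        return True
    except ParseError:
        return False
```

For example, `accepts("while ( a ) if ( b ) break ; else c ;")` succeeds, whereas `accepts("{ break ; }")` fails. Because the count is passed as an argument rather than stored in the parser, there is no state to restore when a loop body ends.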
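There is no standard syntax for attribute grammars, or for using attribute values in guards (as I am suggesting for "break"), but in Prolog definite-clause grammar (DCG) notation your grammar might look like the following. I have added some notes on DCG notation, in case it has been a long time since you last used it.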
/* nt(X) means, roughly, pass the value X as an inherited attribute.
** In a recursive-descent system, it can be passed as a parameter.
** N.B. in definite-clause grammars, semicolon separates alternatives,
** and full stop ends a rule.
*/
/* DCG doesn't have regular-right-part rules, so we have to
** handle repetition via recursion.
*/
program -->
statement(0);
statement(0), program.
statement(LC) -->
exprStmt(LC);
ifStmt(LC);
printStmt(LC);
whileStmt(LC);
block(LC);
break(LC).
block(LC) -->
"{", star_declaration(LC), "}".
/* The notation [] denotes the empty list, and matches zero
** tokens in the input.
*/
star_declaration(LC) -->
[];
declaration(LC), star_declaration(LC).
/* On the RHS of a rule, braces { ... } contain Prolog code. Here,
** the code "LC2 is LC + 1" adds 1 to LC and binds LC2 to that value.
*/
whileStmt(LC) -->
{ LC2 is LC + 1 }, "while", "(", expression(LC2), ")", statement(LC2).
ifStmt(LC) --> "if", "(", expression(LC), ")", statement(LC), opt_else(LC).
opt_else(LC) -->
"else", statement(LC);
[].
/* The definition of break checks the value of the loop count:
** "LC > 0" succeeds if LC is greater than zero, and allows the
** parse to succeed. If LC is not greater than zero, the expression
** fails. And since there is no other rule for 'break', any attempt
** to parse a 'break' rule when LC = 0 will fail.
*/
break(LC) --> { LC > 0 }, "break", ";".
A good introduction to attribute grammars can be found in Grune and Jacobs, Parsing Techniques.