Parsing: best ADT representation of an AST

I have the following grammar for expressions that I am trying to represent as a Haskell ADT:

Expr = SimpleExpr [OPrelation SimpleExpr]  
SimpleExpr = [OPunary] Term {OPadd Term}  
Term = Factor {OPmult Factor}  

where:

{} means 0 or more
[] means 0 or 1
OPmult, OPadd, OPrelation and OPunary are classes of operators

Note that this grammar does get precedence right.

Here is what I tried:

data Expr  = Expr SimpleExpr (Maybe OPrelation) (Maybe SimpleExpr)
data SimpleExpr = SimpleExpr (Maybe OPunary) Term [OPadd] [Term]
data Term = Term Factor [OPmult] [Factor]
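
In hindsight I think this is bad, especially the [OPadd] [Term] and [OPmult] [Factor] parts. In the parse tree for 1+2+3, for example, [+, +] ends up in one branch and [2, 3] in another, which means the operators are decoupled from their operands.

What would be a suitable representation that works well in the next stages of compilation?

  • Decomposing {} and [] into more data types seems like overkill
  • Using lists doesn't seem quite right, because it is no longer a tree (just a node that is a list)
  • Maybe for {}. A good idea?

Finally, I assume that after parsing I will have to pass over the parse tree and reduce it to an AST? Or should the whole grammar be modified to be less complex? Or is it abstract enough already?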

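The AST doesn't need to be that close to the grammar. The grammar is structured into multiple levels to encode precedence, and it uses repetition to avoid left recursion while still handling left-associative operators correctly. The AST doesn't need to worry about such things.

Instead, I would define the AST like this: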

data Expr = BinaryOperation BinaryOperator Expr Expr
          | UnaryOperation UnaryOperator Expr
          | Literal LiteralValue
          | Variable Id
data BinaryOperator = Add | Sub | Mul | Div
data UnaryOperator = Not | Negate
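
To make the point about repetition and left associativity concrete, here is a minimal sketch of how the output of a rule such as `Term {OPadd Term}` could be folded into this AST. The helper name `leftAssoc` is hypothetical, and `LiteralValue` and `Id` are given placeholder definitions (`Int` and `String`) only because the answer leaves them undefined.

```haskell
-- Assumed stand-ins: the answer does not define these two types.
type LiteralValue = Int
type Id = String

data Expr = BinaryOperation BinaryOperator Expr Expr
          | UnaryOperation UnaryOperator Expr
          | Literal LiteralValue
          | Variable Id
          deriving (Show, Eq)

data BinaryOperator = Add | Sub | Mul | Div deriving (Show, Eq)
data UnaryOperator = Not | Negate deriving (Show, Eq)

-- Hypothetical helper. A parser for `Term {OPadd Term}` naturally yields
-- a first operand plus a list of (operator, operand) pairs. A left fold
-- turns that flat shape into a left-nested tree, so each operator stays
-- coupled to its operands.
leftAssoc :: Expr -> [(BinaryOperator, Expr)] -> Expr
leftAssoc = foldl (\acc (op, rhs) -> BinaryOperation op acc rhs)

-- 1 - 2 + 3, in the shape the repetition would deliver it
example :: Expr
example = leftAssoc (Literal 1) [(Sub, Literal 2), (Add, Literal 3)]
```

Loaded in GHCi, `example` evaluates to `BinaryOperation Add (BinaryOperation Sub (Literal 1) (Literal 2)) (Literal 3)`, that is, (1 - 2) + 3. The grammar levels and the repetition are gone from the result. Only the nesting remains.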

Here is a supplementary answer that may be helpful to you. I don't want to spoil your fun, so here is a very simple example grammar:

-- Expr = Term ['+' Term]
-- Term = Factor ['*' Factor]
-- Factor = number | '(' Expr ')'
-- number = one or more digits

Using a CST

As one approach, we can represent this grammar as a concrete syntax tree (CST):

data Expr = TermE Term | PlusE Term Term            deriving (Show)
data Term = FactorT Factor | TimesT Factor Factor   deriving (Show)
data Factor = NumberF Int | ParenF Expr             deriving (Show)

A Parsec-based parser that turns the concrete syntax into the CST might look like this:

expr :: Parser Expr
expr = do
  t1 <- term
  (PlusE t1 <$ symbol "+" <*> term)
    <|> pure (TermE t1)

term :: Parser Term
term = do
  f1 <- factor
  (TimesT f1 <$ symbol "*" <*> factor)
    <|> pure (FactorT f1)

factor :: Parser Factor
factor = NumberF . read <$> lexeme (many1 (satisfy isDigit))
    <|> ParenF <$> between (symbol "(") (symbol ")") expr

After that, we can run:

> parseExpr "1+1*(3+4)"
PlusE (FactorT (NumberF 1)) (TimesT (NumberF 1) (ParenF (PlusE
(FactorT (NumberF 3)) (FactorT (NumberF 4)))))
>

To convert this into the following AST:

data AExpr -- Abstract Expression
  = NumberA Int
  | PlusA AExpr AExpr
  | TimesA AExpr AExpr

we can write:

aexpr :: Expr -> AExpr
aexpr (TermE t) = aterm t
aexpr (PlusE t1 t2) = PlusA (aterm t1) (aterm t2)

aterm :: Term -> AExpr
aterm (FactorT f) = afactor f
aterm (TimesT f1 f2) = TimesA (afactor f1) (afactor f2)

afactor :: Factor -> AExpr
afactor (NumberF n) = NumberA n
afactor (ParenF e) = aexpr e

To interpret the AST, we can use:

interp :: AExpr -> Int
interp (NumberA n) = n
interp (PlusA e1 e2) = interp e1 + interp e2
interp (TimesA e1 e2) = interp e1 * interp e2

and then write:

calc :: String -> Int
calc = interp . aexpr . parseExpr

after which we have a crude little calculator:

> calc "1 + 2 * (6 + 3)"
19
>

Skipping the CST

As an alternative, we can replace the parser with one that parses directly into an AST of type AExpr:

expr :: Parser AExpr
expr = do
  t1 <- term
  (PlusA t1 <$ symbol "+" <*> term)
    <|> pure t1

term :: Parser AExpr
term = do
  f1 <- factor
  (TimesA f1 <$ symbol "*" <*> factor)
    <|> pure f1

factor :: Parser AExpr
factor = NumberA . read <$> lexeme (many1 (satisfy isDigit))
    <|> between (symbol "(") (symbol ")") expr
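
The example grammar allows at most one `+` or `*` per level (`[...]`), whereas the grammar in the question uses repetition (`{...}`). As a sketch of how the direct-to-AST parser might be extended to repetition, Parsec's `chainl1` combinator parses one or more operands separated by an operator and combines them to the left. Restating the grammar as `Expr = Term {'+' Term}` is an assumption made for illustration, and `parseAll` is a hypothetical helper that also requires the whole input to be consumed.

```haskell
import Data.Char (isDigit)
import Text.Parsec
import Text.Parsec.String (Parser)

data AExpr -- Abstract Expression
  = NumberA Int
  | PlusA AExpr AExpr
  | TimesA AExpr AExpr
  deriving (Show, Eq)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

-- Assumed grammar: Expr = Term {'+' Term}
-- chainl1 folds the repeated operands into a left-nested tree.
expr :: Parser AExpr
expr = term `chainl1` (PlusA <$ symbol "+")

-- Assumed grammar: Term = Factor {'*' Factor}
term :: Parser AExpr
term = factor `chainl1` (TimesA <$ symbol "*")

factor :: Parser AExpr
factor = NumberA . read <$> lexeme (many1 (satisfy isDigit))
     <|> between (symbol "(") (symbol ")") expr

-- Hypothetical helper: returns the parse error instead of calling `error`,
-- and uses `eof` so that trailing input is rejected.
parseAll :: String -> Either ParseError AExpr
parseAll = parse (spaces *> expr <* eof) "(string)"
```

With these definitions, `parseAll "1 + 2 + 3"` yields `Right (PlusA (PlusA (NumberA 1) (NumberA 2)) (NumberA 3))`. No intermediate list of operators or operands appears in the result type, which addresses the decoupling concern raised in the question.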

Reference programs

Here is the complete program using an intermediate CST:

-- Calc1.hs, using a CST

{-# OPTIONS_GHC -Wall #-}

module Calc1 where

import Data.Char
import Text.Parsec
import Text.Parsec.String

data Expr = TermE Term | PlusE Term Term            deriving (Show)
data Term = FactorT Factor | TimesT Factor Factor   deriving (Show)
data Factor = NumberF Int | ParenF Expr             deriving (Show)

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

expr :: Parser Expr
expr = do
  t1 <- term
  (PlusE t1 <$ symbol "+" <*> term)
    <|> pure (TermE t1)

term :: Parser Term
term = do
  f1 <- factor
  (TimesT f1 <$ symbol "*" <*> factor)
    <|> pure (FactorT f1)

factor :: Parser Factor
factor = NumberF . read <$> lexeme (many1 (satisfy isDigit))
    <|> ParenF <$> between (symbol "(") (symbol ")") expr

parseExpr :: String -> Expr
parseExpr pgm = case parse (spaces *> expr) "(string)" pgm of
  Right e -> e
  Left err -> error $ show err

data AExpr -- Abstract Expression
  = NumberA Int
  | PlusA AExpr AExpr
  | TimesA AExpr AExpr

aexpr :: Expr -> AExpr
aexpr (TermE t) = aterm t
aexpr (PlusE t1 t2) = PlusA (aterm t1) (aterm t2)

aterm :: Term -> AExpr
aterm (FactorT f) = afactor f
aterm (TimesT f1 f2) = TimesA (afactor f1) (afactor f2)

afactor :: Factor -> AExpr
afactor (NumberF n) = NumberA n
afactor (ParenF e) = aexpr e

interp :: AExpr -> Int
interp (NumberA n) = n
interp (PlusA e1 e2) = interp e1 + interp e2
interp (TimesA e1 e2) = interp e1 * interp e2

calc :: String -> Int
calc = interp . aexpr . parseExpr

And here is the complete program for the more traditional solution that skips an explicit CST representation:

-- Calc2.hs, with direct parsing to AST

{-# OPTIONS_GHC -Wall #-}

module Calc where

import Data.Char
import Text.Parsec
import Text.Parsec.String

lexeme :: Parser a -> Parser a
lexeme p = p <* spaces

symbol :: String -> Parser String
symbol = lexeme . string

expr :: Parser AExpr
expr = do
  t1 <- term
  (PlusA t1 <$ symbol "+" <*> term)
    <|> pure t1

term :: Parser AExpr
term = do
  f1 <- factor
  (TimesA f1 <$ symbol "*" <*> factor)
    <|> pure f1

factor :: Parser AExpr
factor = NumberA . read <$> lexeme (many1 (satisfy isDigit))
    <|> between (symbol "(") (symbol ")") expr

parseExpr :: String -> AExpr
parseExpr pgm = case parse (spaces *> expr) "(string)" pgm of
  Right e -> e
  Left err -> error $ show err

data AExpr -- Abstract Expression
  = NumberA Int
  | PlusA AExpr AExpr
  | TimesA AExpr AExpr

interp :: AExpr -> Int
interp (NumberA n) = n
interp (PlusA e1 e2) = interp e1 + interp e2
interp (TimesA e1 e2) = interp e1 * interp e2

calc :: String -> Int
calc = interp . parseExpr