Is there a more maintainable way to handle my data type in Haskell?
I define the productions of a recursive descent parser with the following data type:
data CST
  = Program CST CST
  | Block CST CST CST
  | StatementList CST CST
  | EmptyStatementList
  | Statement CST
  | PrintStatement CST CST CST CST
  | AssignmentStatement CST CST CST
  | VarDecl CST CST
  | WhileStatement CST CST CST
  | IfStatement CST CST CST
  | Expr CST
  | IntExpr1 CST CST CST
  | IntExpr2 CST
  | StringExpr CST CST CST
  | BooleanExpr1 CST CST CST CST CST
  | BooleanExpr2 CST
  | Id CST
  | CharList CST CST
  | EmptyCharList
  | Type CST
  | Character CST
  | Space CST
  | Digit CST
  | BoolOp CST
  | BoolVal CST
  | IntOp CST
  | TermComponent Token
  | ErrorTermComponent (Token, Int)
  | NoInput
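As the name suggests, this data type builds a concrete syntax tree. I'm wondering whether there is a more maintainable way to pattern match on this type. For example, to trace the execution of the parse calls, I have the following: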
checkAndPrintParse :: CST -> IO ()
checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram"
  checkAndPrintParse c1
  checkAndPrintParse c2
checkAndPrintParse (Block c1 c2 c3) = do
  putStrLn "Parser: parseBlock"
  checkAndPrintParse c1
  checkAndPrintParse c2
  checkAndPrintParse c3
checkAndPrintParse (StatementList c1 c2) = do
  putStrLn "Parser: parseStatementList"
  checkAndPrintParse c1
  checkAndPrintParse c2
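And so on. I have looked into the fix function/pattern, but I'm not sure whether it applies here.

Use generics to get the names of the constructors: derive Generic (from GHC.Generics), then call conNameOf :: CSTF a -> String (from Generics.Deriving, in the generic-deriving package).

Use recursion-schemes to traverse the recursive type: derive the base functor of the recursive type with makeBaseFunctor. The base functor of CST, called CSTF, is a parameterized type with the same shape as CST, but with every recursive occurrence of CST replaced by the type parameter.

cata can be a bit mind-bending when you first learn to use it. In this case, we want to recursively build an IO () action from a CST, i.e., a function CST -> IO (). For that, the type of cata, (Base t a -> a) -> t -> a, becomes (CSTF (IO ()) -> IO ()) -> CST -> IO () (with t ~ CST and a ~ IO ()). The first argument defines the body of the resulting recursive function, and the results of the recursive calls are placed in the fields of the base functor.

So if your goal is to write a recursive function checkAndPrintParse with a case like this one: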
checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram"
  checkAndPrintParse c1
  checkAndPrintParse c2
then cata will put the results of the recursive calls on c1 and c2 in place of those fields:
-- goal: find f such that cata f = checkAndPrintParse
-- By definition of cata
cata f (Program c1 c2) = f (ProgramF (cata f c1) (cata f c2))
-- By the goal and the definition of checkAndPrintParse
cata f (Program c1 c2) = checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram"
  checkAndPrintParse c1
  checkAndPrintParse c2
So:
f (ProgramF (cata f c1) (cata f c2)) = do
  putStrLn "Parser: parseProgram"
  cata f c1
  cata f c2
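Abstract over cata f c1 and cata f c2, replacing them with plain variables.
Then recognize the result as a fold, in the sense of Foldable.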
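The code for these two intermediate steps can be sketched as follows. This is a reconstruction under assumptions rather than the answer's exact text: MiniF is a hypothetical two-constructor stand-in for the generated CSTF, the names fAbstracted and fFolded are made up for illustration, and the label "parseNoInput" is a guess at what the question's code would print for that constructor.

```haskell
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor #-}

-- Hypothetical cut-down base functor; the real CSTF has one
-- constructor per CST constructor, all following this pattern.
data MiniF r
  = ProgramF r r
  | NoInputF
  deriving (Functor, Foldable)

-- Step 1: abstract over the recursive results. Whatever `cata f c1`
-- and `cata f c2` evaluate to, call them x1 and x2.
fAbstracted :: MiniF (IO ()) -> IO ()
fAbstracted (ProgramF x1 x2) = do
  putStrLn "Parser: parseProgram"
  x1
  x2
fAbstracted NoInputF = putStrLn "Parser: parseNoInput"

-- Step 2: running every field in order is what sequence_ from
-- Foldable does, so the fields no longer need to be named.
fFolded :: MiniF (IO ()) -> IO ()
fFolded t@(ProgramF _ _) = do
  putStrLn "Parser: parseProgram"
  sequence_ t
fFolded t@NoInputF = do
  putStrLn "Parser: parseNoInput"
  sequence_ t
```

After step 2 every case has the same body apart from the label, which is what makes the final generalization possible.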
Generalize once more, so that a single equation covers every constructor:
f t = do
  putStrLn $ "Parser: " ++ conNameOf t -- Prints "ProgramF" instead of "parseProgram"... *shrugs*
  sequence_ t
This is the f we give to cata.
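Output, for the example tree Program (Block NoInput NoInput NoInput) (Id NoInput) used in the complete program further down: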
Parser: ProgramF
Parser: BlockF
Parser: NoInputF
Parser: NoInputF
Parser: NoInputF
Parser: IdF
Parser: NoInputF
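To see what makeBaseFunctor and cata provide, the same trace can be reproduced with base alone on a cut-down tree. This is only a sketch: Mini, MiniF, project, cataMini, nameOf and traceIO are hypothetical stand-ins written by hand, not the code that recursion-schemes or generic-deriving actually generate.

```haskell
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor #-}

-- Hypothetical three-constructor stand-in for CST.
data Mini
  = Program Mini Mini
  | Id Mini
  | NoInput

-- Same shape as Mini, with the recursive fields replaced by r.
data MiniF r
  = ProgramF r r
  | IdF r
  | NoInputF
  deriving (Functor, Foldable)

-- Unwrap one layer of the tree.
project :: Mini -> MiniF Mini
project (Program a b) = ProgramF a b
project (Id a)        = IdF a
project NoInput       = NoInputF

-- Hand-written catamorphism: recurse into the fields, then apply f.
cataMini :: (MiniF a -> a) -> Mini -> a
cataMini f = f . fmap (cataMini f) . project

-- Hand-written constructor names (conNameOf does this generically).
nameOf :: MiniF r -> String
nameOf (ProgramF _ _) = "ProgramF"
nameOf (IdF _)        = "IdF"
nameOf NoInputF       = "NoInputF"

-- Mirrors checkAndPrintParse: print this node, then run the children.
traceIO :: Mini -> IO ()
traceIO = cataMini $ \t -> do
  putStrLn ("Parser: " ++ nameOf t)
  sequence_ t
```

The point of the libraries is that MiniF, project and nameOf do not have to be written or kept in sync by hand when constructors are added to the real CST.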
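You should refactor your function into one that produces strings rather than performing IO.
Thanks!
My intuition is that your grammar looks a bit flat. Digits and while statements are unnaturally lumped together. I would want to break it up into multiple types that reflect the structure of the language more closely.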
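The first suggestion above, returning strings instead of performing IO, might look like the sketch below. It uses plain recursion on a cut-down tree so that it needs nothing outside the Prelude. Mini, node, traceLines and printTrace are hypothetical names, and the labels assume each parser function is called parse followed by the constructor name.

```haskell
-- Hypothetical three-constructor stand-in for CST.
data Mini
  = Program Mini Mini
  | Id Mini
  | NoInput

-- Constructor label and direct children of a node, written by hand here.
node :: Mini -> (String, [Mini])
node (Program a b) = ("Program", [a, b])
node (Id a)        = ("Id", [a])
node NoInput       = ("NoInput", [])

-- Pure trace: one line per node, visiting a node before its children.
traceLines :: Mini -> [String]
traceLines t = ("Parser: parse" ++ name) : concatMap traceLines kids
  where
    (name, kids) = node t

-- Printing becomes a thin wrapper that callers can skip entirely.
printTrace :: Mini -> IO ()
printTrace = mapM_ putStrLn . traceLines
```

Keeping the trace pure means it can be compared against an expected list in a test, or filtered and indented, before anything is printed.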
Complete program:
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}

import GHC.Generics
import Generics.Deriving (conNameOf)
import Data.Functor.Foldable
import Data.Functor.Foldable.TH (makeBaseFunctor)

data CST
  = Program CST CST
  | Block CST CST CST
  | StatementList CST CST
  | EmptyStatementList
  | Statement CST
  | PrintStatement CST CST CST CST
  | AssignmentStatement CST CST CST
  | VarDecl CST CST
  | WhileStatement CST CST CST
  | IfStatement CST CST CST
  | Expr CST
  | IntExpr1 CST CST CST
  | IntExpr2 CST
  | StringExpr CST CST CST
  | BooleanExpr1 CST CST CST CST CST
  | BooleanExpr2 CST
  | Id CST
  | CharList CST CST
  | EmptyCharList
  | Type CST
  | Character CST
  | Space CST
  | Digit CST
  | BoolOp CST
  | BoolVal CST
  | IntOp CST
  | TermComponent Token
  | ErrorTermComponent (Token, Int)
  | NoInput
  deriving Generic

-- Placeholder token type so that the example compiles.
data Token = Token

-- Generates CSTF and its Functor, Foldable, Traversable and Recursive instances.
makeBaseFunctor ''CST

-- conNameOf needs a Generic instance for the base functor.
deriving instance Generic (CSTF a)

checkAndPrintParse :: CST -> IO ()
checkAndPrintParse = cata $ \t -> do
  putStrLn $ "Parser: " ++ conNameOf t
  sequence_ t

main :: IO ()
main = checkAndPrintParse $
  Program (Block NoInput NoInput NoInput) (Id NoInput)
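The comment in f notes that the trace shows ProgramF where the question printed parseProgram. If the original labels matter, a small helper could rewrite the constructor name before printing. This is a sketch: parserLabel is a hypothetical name, and it assumes every parser function is named parse followed by the CST constructor name, which holds for the three cases shown in the question but is not confirmed for the rest.

```haskell
import Data.List (stripPrefix)

-- Turn a base-functor constructor name such as "ProgramF" into the
-- label used in the question, "parseProgram". Names that do not end
-- in 'F' only get the "parse" prefix added.
parserLabel :: String -> String
parserLabel con = "parse" ++ dropTrailingF con
  where
    dropTrailingF s = case stripPrefix "F" (reverse s) of
      Just rest -> reverse rest
      Nothing   -> s
```

In the algebra passed to cata, the printing line would then read putStrLn $ "Parser: " ++ parserLabel (conNameOf t).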