Is there a more maintainable way to handle my data type in Haskell?


I defined the productions of a recursive descent parser using the following data type:

data CST 
    = Program CST CST
    | Block CST CST CST 
    | StatementList CST CST
    | EmptyStatementList
    | Statement CST
    | PrintStatement CST CST CST CST
    | AssignmentStatement CST CST CST
    | VarDecl CST CST
    | WhileStatement CST CST CST 
    | IfStatement CST CST CST 
    | Expr CST
    | IntExpr1 CST CST CST 
    | IntExpr2 CST
    | StringExpr CST CST CST
    | BooleanExpr1 CST CST CST CST CST
    | BooleanExpr2 CST 
    | Id CST
    | CharList CST CST 
    | EmptyCharList
    | Type CST 
    | Character CST
    | Space CST
    | Digit CST
    | BoolOp CST
    | BoolVal CST
    | IntOp CST
    | TermComponent Token
    | ErrorTermComponent (Token, Int)
    | NoInput
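As the name of the data type suggests, it represents a concrete syntax tree. I would like to know whether there is a more maintainable way to pattern match on this type. For example, to trace the execution of the parse calls, I have the following: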

checkAndPrintParse :: CST -> IO ()
checkAndPrintParse (Program c1 c2) = do
    putStrLn "Parser: parseProgram" 
    checkAndPrintParse c1
    checkAndPrintParse c2
checkAndPrintParse (Block c1 c2 c3) = do
    putStrLn "Parser: parseBlock"
    checkAndPrintParse c1
    checkAndPrintParse c2
    checkAndPrintParse c3
checkAndPrintParse (StatementList c1 c2) = do
    putStrLn "Parser: parseStatementList"
    checkAndPrintParse c1
    checkAndPrintParse c2

And so on. I have looked into the fix function/pattern, but I am not sure whether it applies here.

Get the names of constructors using generic-deriving:

Derive Generic (from GHC.Generics), then call conNameOf :: CSTF a -> String (from Generics.Deriving).

Traverse the recursive type using recursion-schemes:

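Derive the base functor of the recursive type using makeBaseFunctor. The base functor of CST, called CSTF, is a parameterized type with the same shape as CST, but with the recursive occurrences of CST replaced by the type parameter.

Learning to use cata can be a bit mind-bending at first. In this case, we want to construct an IO () action recursively from a CST, i.e., a function CST -> IO (). For that, the type of cata becomes (CSTF (IO ()) -> IO ()) -> CST -> IO () (with t ~ CST and a ~ IO ()), where the first argument defines the body of the resulting recursive function, and the results of the recursive calls are placed in the fields of the base functor.

So, if your goal is to write a recursive function checkAndPrintParse with a case like this: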

checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram" 
  checkAndPrintParse c1
  checkAndPrintParse c2
cata will put the results of its recursive calls on c1 and c2 in place of those fields:

-- goal: find f such that   cata f = checkAndPrintParse

-- By definition of cata
cata f (Program c1 c2) = f (ProgramF (cata f c1) (cata f c2))

-- By the goal and the definition of checkAndPrintParse
cata f (Program c1 c2) = checkAndPrintParse (Program c1 c2) = do
  putStrLn "Parser: parseProgram" 
  checkAndPrintParse c1
  checkAndPrintParse c2
So:

f (ProgramF (cata f c1) (cata f c2)) = do
  putStrLn "Parser: parseProgram"
  cata f c1
  cata f c2
Abstract over cata f c1 and cata f c2.
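A minimal sketch of this step, using a hand-written, cut-down stand-in for CSTF (only ProgramF and NoInputF) so that it runs without Template Haskell. The names x1 and x2 and the message printed for NoInputF are illustrative assumptions; x1 and x2 stand for the actions that cata f c1 and cata f c2 would already have produced.

```haskell
-- Hand-written stand-in for the CSTF that makeBaseFunctor would generate.
-- Only two constructors are kept for illustration.
data CSTF r
  = ProgramF r r
  | NoInputF

-- One case of the algebra. x1 and x2 are the already-built IO actions of
-- the two children, so the body no longer mentions cata at all.
f :: CSTF (IO ()) -> IO ()
f (ProgramF x1 x2) = do
  putStrLn "Parser: parseProgram"
  x1
  x2
f NoInputF = putStrLn "Parser: NoInput"  -- placeholder message (assumption)

main :: IO ()
main = f (ProgramF (f NoInputF) (f NoInputF))
```

Here main plays the role of cata by building the child actions by hand before passing them to f.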

Recognize a fold (in the sense of Foldable).
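Once the children are plain IO () values sitting in the fields of the constructor, running them left to right is what sequence_ from Foldable does. A sketch under the same assumption as before (a hand-written, cut-down CSTF, here with a derived Foldable instance; the catch-all case is only there to keep the example total):

```haskell
{-# LANGUAGE DeriveFoldable #-}

-- Cut-down stand-in for the generated base functor (assumption).
data CSTF r
  = ProgramF r r
  | BlockF r r r
  | NoInputF
  deriving (Foldable)

f :: CSTF (IO ()) -> IO ()
f t@(ProgramF _ _) = do
  putStrLn "Parser: parseProgram"
  sequence_ t  -- runs every field of t in order, replacing "x1; x2"
f t = sequence_ t  -- other constructors: run the children, print nothing

main :: IO ()
main = f (ProgramF (putStrLn "first child") (putStrLn "second child"))
```

Because the body no longer depends on how many fields the constructor has, the same two lines work for every constructor, which is what makes the next generalization possible.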

Generalize again:

f t = do
  putStrLn $ "Parser: " ++ conNameOf t  -- Prints "ProgramF" instead of "parseProgram"... *shrugs*
  sequence_ t
That is what we pass to cata (the complete program is listed further below).

Output:

Parser: ProgramF
Parser: BlockF
Parser: NoInputF
Parser: NoInputF
Parser: NoInputF
Parser: IdF
Parser: NoInputF
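If the original "parseProgram"-style labels matter, one possible approach is to post-process the constructor name. The sketch below assumes that every label is just "parse" followed by the constructor name without its trailing F; parserLabel and stripSuffix are hypothetical helpers written for this illustration, not part of generic-deriving. Constructors that have no matching parse function (NoInput, for instance) would still need special handling.

```haskell
import Data.List (stripPrefix)
import Data.Maybe (fromMaybe)

-- Remove a suffix if it is present (hypothetical helper).
stripSuffix :: String -> String -> Maybe String
stripSuffix suf s = reverse <$> stripPrefix (reverse suf) (reverse s)

-- Turn a base-functor constructor name into a parser label,
-- e.g. "ProgramF" becomes "parseProgram".
parserLabel :: String -> String
parserLabel con = "parse" ++ fromMaybe con (stripSuffix "F" con)

main :: IO ()
main = mapM_ (putStrLn . ("Parser: " ++) . parserLabel) ["ProgramF", "BlockF"]
```

In the algebra, this would mean applying parserLabel to the result of conNameOf t before printing.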

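You should refactor your function into one that produces a string rather than one that performs IO.

Thanks!

My gut feeling is that your grammar looks a bit monolithic. Digits and while statements are unnaturally lumped together. I would break it up into multiple types that reflect the structure of the language more closely.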
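As a rough illustration of the first suggestion, the same fold can return the trace as a list of lines and leave printing to the caller. To keep the sketch runnable with base alone, it uses a cut-down CST and hand-written project, cata, and conName functions; these are local stand-ins (assumptions) for what recursion-schemes and generic-deriving provide, and traceParse is a hypothetical name.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable #-}

-- Cut-down CST and its base functor (three constructors only).
data CST = Program CST CST | Id CST | NoInput

data CSTF r = ProgramF r r | IdF r | NoInputF
  deriving (Functor, Foldable)

-- Unwrap one layer of the tree.
project :: CST -> CSTF CST
project (Program a b) = ProgramF a b
project (Id a)        = IdF a
project NoInput       = NoInputF

-- Local catamorphism: fold the children first, then apply the algebra.
cata :: (CSTF a -> a) -> CST -> a
cata alg = alg . fmap (cata alg) . project

-- Stand-in for conNameOf.
conName :: CSTF r -> String
conName ProgramF{} = "Program"
conName IdF{}      = "Id"
conName NoInputF   = "NoInput"

-- Pure trace: this node's line, followed by the lines of its children.
traceParse :: CST -> [String]
traceParse = cata $ \t -> ("Parser: " ++ conName t) : concat t

main :: IO ()
main = mapM_ putStrLn (traceParse (Program NoInput (Id NoInput)))
```

A pure traceParse is easier to test and to reuse, since the only IO left is the final mapM_ putStrLn.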
The complete program:

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE TemplateHaskell #-}

import GHC.Generics
import Generics.Deriving (conNameOf)
import Data.Functor.Foldable
import Data.Functor.Foldable.TH (makeBaseFunctor)

data CST 
    = Program CST CST
    | Block CST CST CST 
    | StatementList CST CST
    | EmptyStatementList
    | Statement CST
    | PrintStatement CST CST CST CST
    | AssignmentStatement CST CST CST
    | VarDecl CST CST
    | WhileStatement CST CST CST 
    | IfStatement CST CST CST 
    | Expr CST
    | IntExpr1 CST CST CST 
    | IntExpr2 CST
    | StringExpr CST CST CST
    | BooleanExpr1 CST CST CST CST CST
    | BooleanExpr2 CST 
    | Id CST
    | CharList CST CST 
    | EmptyCharList
    | Type CST 
    | Character CST
    | Space CST
    | Digit CST
    | BoolOp CST
    | BoolVal CST
    | IntOp CST
    | TermComponent Token
    | ErrorTermComponent (Token, Int)
    | NoInput
    deriving Generic

data Token = Token

makeBaseFunctor ''CST

deriving instance Generic (CSTF a)

checkAndPrintParse :: CST -> IO ()
checkAndPrintParse = cata $ \t -> do
  putStrLn $ "Parser: " ++ conNameOf t
  sequence_ t

main = checkAndPrintParse $
  Program (Block NoInput NoInput NoInput) (Id NoInput)