Implementing "read" for a left-associative tree in Haskell


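I'm having trouble implementing Read for a tree structure. I want to take a left-associative string (with parentheses), such as ABC(DE)F, and convert it into a tree.

The data type I'm using (though I'm open to suggestions) is a binary tree with two constructors, Branch Tree Tree and Leaf Char.

In Haskell, the tree for that example would be: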

example = Branch (Branch (Branch (Branch (Leaf 'A')
                                         (Leaf 'B'))
                                 (Leaf 'C'))
                         (Branch (Leaf 'D')
                                 (Leaf 'E')))
                 (Leaf 'F')

My show function looks like this:

instance Show Tree where
    show (Branch l r@(Branch _ _)) = show l ++ "(" ++ show r ++ ")"
    show (Branch l r) = show l ++ show r
    show (Leaf x) = [x]

I'd like to write a read function such that

read "ABC(DE)F" == example
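
For reference, the pieces of the question can be assembled into one self-contained module. This is only a sketch: the `deriving Eq` clause is an assumption (the question compares trees with `==`, so some Eq instance is needed), and the type signature on `example` is added for clarity.

```haskell
-- Binary tree with characters at the leaves.
-- Assumption: Eq is derived so that trees can be compared with (==).
data Tree = Branch Tree Tree | Leaf Char deriving Eq

-- Left-nested branches print without parentheses;
-- a branch in right position is wrapped in parentheses.
instance Show Tree where
  show (Branch l r@(Branch _ _)) = show l ++ "(" ++ show r ++ ")"
  show (Branch l r)              = show l ++ show r
  show (Leaf x)                  = [x]

example :: Tree
example = Branch (Branch (Branch (Branch (Leaf 'A') (Leaf 'B')) (Leaf 'C'))
                         (Branch (Leaf 'D') (Leaf 'E')))
                 (Leaf 'F')
```

With these definitions, `show example` evaluates to the string ABC(DE)F, which is the input the desired read function should map back to `example`.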

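This looks very much like a stack structure. As you walk through the input string "ABC(DE)F", you put every atom (non-parenthesis) you find into an accumulator list. Whenever the list holds two items, you can combine them into a branch. This could be done roughly as follows (note: untested, included only to convey the idea):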


This may need some modification, but I think it is enough to get you on the right track.
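
One possible reading of the stack idea, as a minimal sketch rather than the answer's own code: keep one accumulator per open parenthesis, fold each new atom into the accumulator on top, and on ")" fold the finished group into the accumulator beneath it. The name `parseStack` and the use of `Maybe` for failure are assumptions made for illustration.

```haskell
data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

-- Each stack entry is the tree built so far for one open group;
-- Nothing means the group has no elements yet.
parseStack :: String -> Maybe Tree
parseStack = go [Nothing]
  where
    go [Just t] []       = Just t        -- input consumed, no group left open
    go _        []       = Nothing       -- empty input or an unclosed "("
    go stack    ('(':cs) = go (Nothing : stack) cs
    go (Just t : acc : rest) (')':cs) = go (push t acc : rest) cs
    go _        (')':_)  = Nothing       -- empty group "()" or unmatched ")"
    go (acc : rest) (c:cs) = go (push (Leaf c) acc : rest) cs
    go []       _        = Nothing       -- not reached: the stack is never empty

    -- Attach a new right operand to the accumulated left operand.
    push t Nothing  = Just t
    push t (Just l) = Just (Branch l t)
```

Because every new element becomes the right child of whatever has been accumulated so far, the result is left-associative without any explicit fold.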

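In this case, using a parsing library makes the code remarkably short and very expressive. (I was surprised at how neat it turned out when I tried to answer this!)

I'll use Parsec, in "applicative style" (rather than monadic), since we don't need the extra power (or foot-shooting ability) of monads.

Code

First, the various imports and definitions: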

import Text.Parsec

import Control.Applicative ((<*), (<$>))

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

paren, tree, unit :: Parsec String st Tree
(Yes, that's the entire parser!)

We could do without paren and unit if we wanted to, but the code above is very expressive, so we can leave it as is.

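As a brief explanation of the combinators used:

  • <|> basically means "either the left parser or the right parser"
  • <?> lets you produce nicer error messages
  • noneOf parses anything that is not in the given list of characters
  • between takes three parsers and returns the value of the third parser, as long as it is delimited by the first and second
  • char parses its argument literally
  • many1 parses one or more of its argument into a list (the empty string appears to be invalid, hence many1 rather than many, which parses zero or more)
  • eof matches the end of the input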

We can run the parser with the parse function (it returns an Either ParseError Tree: Left is an error, Right is a successful parse).
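
The definitions of unit, paren, tree and onlyTree are not shown above. The following is a sketch that is consistent with the type signature and with the combinators just described; it is a reconstruction and may differ from the answerer's exact code. The label "group or literal" is taken from the error messages shown in the examples.

```haskell
import Text.Parsec

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

paren, tree, unit, onlyTree :: Parsec String st Tree

-- A unit is a parenthesised group or a single non-parenthesis character.
unit = paren <|> (Leaf <$> noneOf "()") <?> "group or literal"

-- A group is an ordinary tree between '(' and ')'.
paren = between (char '(') (char ')') tree

-- One or more units, combined left-associatively.
tree = foldl1 Branch <$> many1 unit

-- Like tree, but fails unless the whole input is consumed.
onlyTree = tree <* eof
```

Under these definitions, `parse onlyTree "" "ABC(DE)F"` yields a Right value holding the example tree, while an input with an unclosed parenthesis yields a Left carrying a ParseError.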

As read

Using it as a read-like function could look something like this:

read' str = case parse onlyTree "" str of
   Right tr -> tr
   Left er -> error (show er)
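(I've used read' to avoid a clash with Prelude.read; if you want a Read instance you'll have to do a bit more work to implement readPrec (or whatever is required), but with the actual parsing already done it shouldn't be too hard.)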
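
One way the "bit more work" might go, as a hedged sketch: instead of readPrec, implement readsPrec by running the tree parser and returning the unconsumed input with Parsec's getInput. The parser definitions below are assumptions, written out in compact form so that the block stands on its own.

```haskell
import Text.Parsec

data Tree = Branch Tree Tree | Leaf Char deriving (Eq, Show)

-- Assumed parser: one or more units, combined left-associatively.
tree, unit :: Parsec String st Tree
unit = between (char '(') (char ')') tree <|> (Leaf <$> noneOf "()")
tree = foldl1 Branch <$> many1 unit

instance Read Tree where
  -- Parse one tree and pair it with the remaining input, as readsPrec requires.
  -- The precedence argument is ignored in this sketch.
  readsPrec _ s =
    case parse ((,) <$> tree <*> getInput) "" s of
      Right result -> [result]
      Left _       -> []
```

With this instance, `read "ABC(DE)F" :: Tree` goes through the Parsec parser, and `reads` exposes any leftover input instead of failing. Note that the detailed Parsec error message is lost, since readsPrec can only signal failure with an empty list.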

Examples

Some basic examples:

*Tree> read' "A"
Leaf 'A'

*Tree> read' "AB"
Branch (Leaf 'A') (Leaf 'B')

*Tree> read' "ABC"
Branch (Branch (Leaf 'A') (Leaf 'B')) (Leaf 'C')

*Tree> read' "A(BC)"
Branch (Leaf 'A') (Branch (Leaf 'B') (Leaf 'C'))

*Tree> read' "ABC(DE)F" == example
True

*Tree> read' "ABC(DEF)" == example
False

*Tree> read' "ABCDEF" == example
False

Demonstrating errors:

*Tree> read' ""
*** Exception: (line 1, column 1):
unexpected end of input
expecting group or literal

*Tree> read' "A(B"
*** Exception: (line 1, column 4):
unexpected end of input
expecting group or literal or ")"

And finally, the difference between tree and onlyTree:

*Tree> parse tree "" "AB)CD"     -- success: ignores ")CD"
Right (Branch (Leaf 'A') (Leaf 'B'))

*Tree> parse onlyTree "" "AB)CD" -- fail: can't parse the ")"
Left (line 1, column 3):
unexpected ')'
expecting group or literal or end of input

Conclusion

Parsec is amazing! This answer may be long, but its core is just 5 or 6 lines of code that do all the work.

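dbaupp's Parsec answer is very easy to understand. As an example of a "low-level" approach, here is a hand-written parser that uses a success continuation to handle the left-associative tree building: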

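import Control.Monad ((>=>))     -- Kleisli composition, used to chain continuations
import Data.Maybe (maybeToList)

instance Read Tree where readsPrec _prec s = maybeToList (readTree s)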

type TreeCont = (Tree,String) -> Maybe (Tree,String)

readTree :: String -> Maybe (Tree,String)
readTree = read'top Just where
  valid ')' = False
  valid '(' = False
  valid _ = True

  read'top :: TreeCont -> String -> Maybe (Tree,String)
  read'top acc s@(x:ys) | valid x =
    case ys of
      [] -> acc (Leaf x,[])
      (y:zs) -> read'branch acc s
  read'top _ _ = Nothing

  -- The next three are mutually recursive

  read'branch :: TreeCont -> String -> Maybe (Tree,String)
  read'branch acc (x:y:zs) | valid x = read'right (combine (Leaf x) >=> acc) y zs
  read'branch _ _ = Nothing

  read'right :: TreeCont -> Char -> String -> Maybe (Tree,String)
  read'right acc y ys | valid y = acc (Leaf y,ys)
  read'right acc '(' ys = read'branch (drop'close >=> acc) ys
     where drop'close (b,')':zs) = Just (b,zs)
           drop'close _ = Nothing
  read'right _ _ _ = Nothing  -- assert y==')' here

  combine :: Tree -> TreeCont
  combine build (t, []) = Just (Branch build t,"")
  combine build (t, ys@(')':_)) = Just (Branch build t,ys)  -- stop when lookahead shows ')'
  combine build (t, y:zs) = read'right (combine (Branch build t)) y zs

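The Parsec parser above works, but it also accepts strings such as "(AB)C(DE)F", which could never come from the given Show instance. In this example, read' "ABC(DE)F" == read' "(AB)C(DE)F" is True.

@ChrisKuklewicz, isn't that the correct behavior? Parentheses are just for grouping; associativity means they are sometimes redundant, but they are still allowed to be there. The only thing I can find in the Haskell 98 Report is "strings produced by showsPrec are usually readable by readsPrec."

I should have written a longer comment to make it clear that I was not saying your parser is incorrect.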