Haskell: nested sequences to a branch/tree data structure

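I'm not sure whether this is an easy problem and I'm just missing something obvious, but I've been banging my head against it for a while. I'm trying to express tree divergence using lists. That way I can specify my dataset inline using simple primitives, without worrying about order, and build the tree later from a dispersed set of lists.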

I have some lists like this:

 a = ["foo", "bar", "qux"]
 b = ["foo", "bar", "baz"]
 c = ["qux", "bar", "qux"]
I'd like a function that takes a sequence of these lists and expresses the tree like so:

myfunc :: [[a]] -> MyTree a

(root) -> foo -> bar -> [baz, qux]
       -> qux -> bar -> qux
An ideal solution would be able to take sequences of varying lengths, i.e.:

a = ["foo"; "bar"; "qux"]
b = ["foo"; "bar"; "baz"; "quux"]
== 
(root) -> foo -> bar -> [qux, baz -> quux]
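Are there any textbook examples or algorithms that could help me with this? It seems like it can be solved elegantly, but all my stabs at it look absolutely horrible.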

Feel free to post a solution in any functional language; I'll translate it as necessary.


Thanks!

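The way I would solve this is to use a `Forest` to represent your type and then make `Forest` a `Monoid`, where `mappend`ing two `Forest`s joins them together at their common ancestors. The rest is just coming up with a suitable `Show` instance: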

import Data.List (sort, groupBy)
import Data.Ord (comparing)
import Data.Foldable (foldMap)
import Data.Function (on)
import Data.Monoid

data Tree a = Node
    { value :: a
    , children :: Forest a
    } deriving (Eq, Ord)

instance (Show a) => Show (Tree a) where
    show (Node a f@(Forest ts0)) = case ts0 of
        []  -> show a
        [t] -> show a ++ " -> " ++ show t
        _   -> show a ++ " -> " ++ show f

data Forest a = Forest [Tree a] deriving (Eq, Ord)

instance (Show a) => Show (Forest a) where
    show (Forest ts0) = case ts0 of
        []  -> "[]"
        [t] -> show t
        ts  -> show ts

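-- Merging two forests joins trees that share a root value, recursively.
-- GHC 8.4 and later require a Semigroup instance before Monoid.
instance (Ord a) => Semigroup (Forest a) where
    Forest tsL <> Forest tsR =
          Forest
        . map (\ts -> Node (value $ head ts) (foldMap children ts))
        . groupBy ((==) `on` value)
        . sort
        $ tsL ++ tsR

instance (Ord a) => Monoid (Forest a) where
    mempty = Forest []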

fromList :: [a] -> Forest a
fromList = foldr cons nil
  where
    cons a as = Forest [Node a as]
    nil = Forest []
Here is some example usage:

>>> let a = fromList ["foo", "bar", "qux"]
>>> let b = fromList ["foo", "bar", "baz", "quux"]
>>> a
"foo" -> "bar" -> "qux"
>>> b
"foo" -> "bar" -> "baz" -> "quux"
>>> a <> b
"foo" -> "bar" -> ["baz" -> "quux","qux"]
>>> a <> a
"foo" -> "bar" -> "qux"

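The question asks for a function of type `[[a]] -> MyTree a`, while the answer above merges two forests at a time. A minimal sketch of how any number of paths could be folded together is given below. It restates the answer's types in compact form (with derived `Show` instead of the custom one), and `fromLists` is a hypothetical helper name that does not appear in the answer.

```haskell
import Data.Function (on)
import Data.List (groupBy, sort)

data Tree a = Node { value :: a, children :: Forest a }
  deriving (Eq, Ord, Show)

newtype Forest a = Forest [Tree a]
  deriving (Eq, Ord, Show)

-- Merge trees that share a root value, recursively merging their children.
instance Ord a => Semigroup (Forest a) where
  Forest l <> Forest r =
    Forest
      [ Node (value t) (foldMap children grp)
      | grp@(t : _) <- groupBy ((==) `on` value) (sort (l ++ r))
      ]

instance Ord a => Monoid (Forest a) where
  mempty = Forest []

-- Turn one path into a single-branch forest.
fromList :: [a] -> Forest a
fromList = foldr (\x rest -> Forest [Node x rest]) (Forest [])

-- Hypothetical helper (not in the answer): merge any number of paths.
fromLists :: Ord a => [[a]] -> Forest a
fromLists = foldMap fromList
```

Because the merge sorts and groups by value, the order of the input lists should not matter, which matches the "without worrying about order" requirement in the question.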
I came up with a solution very similar to Gabriel's, but my data representation uses a `Map`, so that I can load most of the work onto `Data.Map.unionWith`.

import Data.Map (Map, empty, singleton, unionWith, assocs)
import Data.Monoid

type Path a = [a]
data Tree a = Tree {leaf :: Bool, childs :: Map a (Tree a)} deriving Show
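The boolean flag in the tree marks whether this node can be the end of a path. The `a` values are hidden inside the `childs` map. To warm up, let us define how to convert a single path into a tree.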

root :: Tree a
root = Tree True empty

cons :: a -> Tree a -> Tree a
cons node tree = Tree False (singleton node tree)

follow :: Path a -> Tree a
follow = foldr cons root
In Gabriel's code, the `follow` function is called `fromList`. We can also enumerate all the paths contained in a tree.

paths :: Tree a -> [Path a]
paths (Tree leaf childs) =
  (if leaf then [[]] else []) ++
  [ node : path | (node, tree) <- assocs childs, path <- paths tree ]
Now, to convert a list of paths into a tree, we just use `mconcat` and `follow`, which requires a `Monoid` instance for `Tree a`.

unpaths :: Ord a => [Path a] -> Tree a
unpaths = mconcat . map follow
Here is a test case using the paths from the question:

a, b, c, d :: Path String

a = ["foo", "bar", "qux"]
b = ["foo", "bar", "baz"]
c = ["qux", "bar", "qux"]
d = ["foo", "bar", "baz", "quux"]

-- test is True
test = (paths . unpaths) [a, b, c, d] == [b, d, a, c]

We get back the same paths that were stored in the tree, but as an ordered list.
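`unpaths` uses `mconcat`, so it depends on a `Monoid` instance for `Tree` that is not shown in the code above. The following is a minimal sketch of what such an instance could look like, assuming that merging means OR-ing the `leaf` flags and combining children with `unionWith`. The instance bodies are an assumption made for illustration, not necessarily the answer's exact code.

```haskell
import Data.Map (Map, assocs, empty, singleton, unionWith)

type Path a = [a]

data Tree a = Tree { leaf :: Bool, childs :: Map a (Tree a) } deriving Show

-- Assumed merge: a node ends a path if it does so in either tree,
-- and children stored under the same key are merged recursively.
instance Ord a => Semigroup (Tree a) where
  Tree l1 c1 <> Tree l2 c2 = Tree (l1 || l2) (unionWith (<>) c1 c2)

-- The empty tree contains no paths at all (leaf is False, unlike root).
instance Ord a => Monoid (Tree a) where
  mempty = Tree False empty

-- A single path becomes a chain of nodes ending in a leaf.
follow :: Path a -> Tree a
follow = foldr (\node tree -> Tree False (singleton node tree)) (Tree True empty)

-- Enumerate every path stored in the tree, in key order.
paths :: Tree a -> [Path a]
paths (Tree isLeaf cs) =
  [ [] | isLeaf ] ++ [ node : path | (node, tree) <- assocs cs, path <- paths tree ]

unpaths :: Ord a => [Path a] -> Tree a
unpaths = mconcat . map follow
```

With this instance, the identity element has to be a tree with `leaf = False`; using `root` as `mempty` would add the empty path to every merged tree.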

A Clojure version, using hash maps:

(defn merge-to-tree
  [& vecs]
  (let [layer (group-by first vecs)]
    (into {} (map (fn [[k v]]
                    (when k
                      [k (apply merge-to-tree (map rest v))]))
                  layer))))
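Here I use `group-by` to see when multiple vector elements should be represented by a single item in the output structure. `(into {} (map (fn [[k v]] ...) m))` is a standard idiom for destructuring hash entries, performing some operation on them, and then rebuilding a hash from the results. The recursive call on the values, `(apply merge-to-tree (map rest v))`, constructs the various branches under that layer of the tree structure (`map rest` because `group-by` retains the full input, and the first element has already been used as the lookup key).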

I welcome other suggestions/improvements. Example usage:

user> (merge-to-tree ["foo" "bar" "qux"])
{"foo" {"bar" {"qux" {}}}}

user> (merge-to-tree ["foo" "bar" "qux"] ["foo" "bar" "baz"] ["qux" "bar" "qux"])
{"foo" {"bar" {"qux" {}, "baz" {}}}, "qux" {"bar" {"qux" {}}}}

user> (merge-to-tree ["foo" "bar" "qux"] ["foo" "bar" "baz" "quux"])
{"foo" {"bar" {"qux" {}, "baz" {"quux" {}}}}}
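For readers following along in Haskell, the same group-by-first-element recursion can be sketched as below. This is an illustrative translation, not part of the Clojure answer: the `Trie` type and the name `mergeToTree` are assumptions, and `Data.Map.fromListWith` stands in for `group-by`.

```haskell
import qualified Data.Map as M

-- Nested maps, mirroring the nested hash maps of the Clojure version.
newtype Trie a = Trie (M.Map a (Trie a)) deriving (Eq, Show)

-- Group the paths by their first element, then recurse on the tails.
-- Empty paths fail the (x : xs) pattern and are dropped, so, like the
-- Clojure version, this does not record where a shorter path ends.
mergeToTree :: Ord a => [[a]] -> Trie a
mergeToTree paths = Trie (M.map mergeToTree grouped)
  where
    grouped = M.fromListWith (flip (++)) [ (x, [xs]) | (x : xs) <- paths ]
```

Calling `mergeToTree [["foo", "bar", "qux"], ["foo", "bar", "baz", "quux"]]` should produce the same nesting as the last Clojure example above.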

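Thanks @gabriel gonzalez, a very nice answer, very elegant.

I have to say this solution is also great. Thanks!

If it is a leaf, it should not have any children, which can be expressed using a sum type: `data Tree a = Leaf | Children (Map a (Tree a)) deriving Show`

@Ankur, that won't work if some paths end early. For example, in `unpaths [[1],[1,2]]`, the node labeled `1` is both a leaf (for the first path) and has children (for the second path). So this data type is not really a tree; it is more like an infinite-state automaton.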
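To make the objection concrete, here is a small sketch of the sum-type variant. Everything in it (`SumTree`, `followS`, `mergeS`, `pathsS`) is hypothetical and written only to illustrate the comment. The merge has to pick a side when a leaf meets a node with children, and here it arbitrarily keeps the children.

```haskell
import qualified Data.Map as M

-- Hypothetical sum-type tree: a node is either a leaf or has children.
data SumTree a = Leaf | Children (M.Map a (SumTree a)) deriving (Eq, Show)

followS :: [a] -> SumTree a
followS = foldr (\x t -> Children (M.singleton x t)) Leaf

-- When a leaf meets a node with children, one of them must be discarded.
mergeS :: Ord a => SumTree a -> SumTree a -> SumTree a
mergeS Leaf t = t
mergeS t Leaf = t
mergeS (Children l) (Children r) = Children (M.unionWith mergeS l r)

pathsS :: SumTree a -> [[a]]
pathsS Leaf = [[]]
pathsS (Children m) = [ x : p | (x, t) <- M.assocs m, p <- pathsS t ]

-- Evaluates to [[1,2]]: the shorter path [1] is lost in the merge.
lost :: [[Int]]
lost = pathsS (mergeS (followS [1]) (followS [1, 2]))
```

Keeping the leaf instead would lose `[1, 2]`, which is why the `Bool` flag alongside the children map is needed to store both paths.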