Tree data structure in Mathematica

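I have used Mathematica mostly as a mathematics workbench and for writing relatively small ad-hoc programs. I am, however, designing a system that I intend to program in Mathematica. I need to store data in a tree, and to search and traverse the tree. Although I know how to implement a tree, I prefer standard, tested code. I looked at what packages there are for basic data structures on the Mathematica users wiki. I found none, although there is a small example in the Mathematica documentation.

Now to my questions:

Is there an (open-source) package for data structures available somewhere?

What approach have you used with regard to data structures? Gradually developing your own util package?

(Not a question, just a remark. Maybe... the lack of (lots of available) open-source packages is the reason why Mathematica doesn't have the momentum it deserves. A chicken-and-egg issue, I am afraid.)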

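In Mathematica, most of what you do is based on expressions, and expressions naturally have a tree structure. For depth-first traversals (which are probably the most common), you can use functions such as Scan, Map, Cases and so on. The difference from more traditional languages is that there is no simple way to preserve the identity of an individual node in an expression tree, since Mathematica has no pointers. Also, because expressions are immutable, many operations that are idiomatic in Mathematica copy the entire expression even when you only need to modify it in a few places.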
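
To make the point about node identity concrete, here is a minimal sketch on a made-up expression. The heads f, g and h are arbitrary placeholders, and treating a position list (as returned by Position) as a stand-in for a pointer is an assumption made for illustration, not something the answer prescribes.

```mathematica
(* Any expression is already a tree; f, g, h are placeholder heads *)
tree = f[g[1, 2], h[3]];

(* Depth-first traversal: Scan visits leaves before the nodes that contain them *)
visited = Reap[Scan[Sow, tree, {0, Infinity}]][[2, 1]];

(* No pointers: a node is addressed by its position inside the whole tree *)
pos = First[Position[tree, 3]];
leaf = Extract[tree, pos];

(* "Modifying" a node builds a new expression; tree itself is left unchanged *)
modified = ReplacePart[tree, {1, 2} -> 20];
```

Note that a position is only meaningful relative to one particular expression: after a structural change, positions computed earlier may refer to different nodes.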

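Using immutable Mathematica expressions as trees still has several advantages. One is that, because they are immutable, it is easy to understand what they store just by looking at them (state and behavior are not mixed). Another is that there are efficient, generic functions such as Map, MapIndexed or Scan that traverse them. For example, the visitor design pattern needs no separate machinery: it is just Map[f, tree, Infinity], built into the language. There are also built-in functions such as Cases, Replace, ReplaceAll and so on, which let you write very concise, declarative code to destructure trees and to find pieces of a tree that have a certain syntax or satisfy some condition. Since trees are not restricted to being built from lists and can be built from different heads, you can use this to write very concise tree-processing code. Finally, the freedom to build any tree structure you want very easily makes it much easier to experiment and prototype, which shortens the development cycle and ultimately leads to better designs.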

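That said, you can certainly implement a "stateful" (mutable) tree data structure. I suspect the real reason this has generally not been done is the performance hit associated with building, modifying and traversing such a tree, since it goes through a full symbolic evaluation process at every step (see the post for more details). For two examples of how a binary search tree can be used in a Mathematica context to get fairly efficient code, see my posts (generic symbolic setting) and (in the context of compiled code). For general ways to construct data structures idiomatically in Mathematica, I recommend Roman Maeder's books: "Programming in Mathematica", "The Mathematica Programmer" I & II, and especially "Computer Science with Mathematica". In the latter he discusses in detail how to implement a binary search tree in Mathematica. As @Simon mentioned, the talk by @Daniel Lichtblau is also a great resource; it shows how to build data structures and make them efficient.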
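
For orientation, here is a minimal sketch of a binary search tree written as plain immutable expressions. It is not the code from the posts or the book mentioned above; the head bst and the function names bstInsert, bstMember and bstToList are hypothetical, keys are assumed to be numbers, and no balancing is attempted.

```mathematica
ClearAll[bst, bstInsert, bstMember, bstToList];

(* Assumed representation: bst[] is the empty tree, bst[left, key, right] a node *)
bstInsert[bst[], k_] := bst[bst[], k, bst[]];
bstInsert[t : bst[l_, key_, r_], k_] := Which[
   k < key, bst[bstInsert[l, k], key, r],
   k > key, bst[l, key, bstInsert[r, k]],
   True, t];   (* key already present: return the tree unchanged *)

(* Membership test by descending left or right *)
bstMember[bst[], _] := False;
bstMember[bst[l_, key_, r_], k_] := Which[
   k < key, bstMember[l, k],
   k > key, bstMember[r, k],
   True, True];

(* In-order traversal yields the keys in sorted order *)
bstToList[bst[]] := {};
bstToList[bst[l_, key_, r_]] := Join[bstToList[l], {key}, bstToList[r]];

t = Fold[bstInsert, bst[], {5, 2, 8, 1, 9, 3}];
sorted = bstToList[t];
```

Each insertion rebuilds the nodes on the path from the root to the insertion point and returns a new tree, and every step is an ordinary pattern-matched function call, which is where the evaluation overhead discussed above comes from.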

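As for general ways to implement data structures in Mathematica that hold some state, here is a simple example taken from my post in a Mathgroup thread. It implements a "pair" data structure.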

Unprotect[pair, setFirst, getFirst, setSecond, getSecond, new, delete];
ClearAll[pair, setFirst, getFirst, setSecond, getSecond, new, delete];
Module[{first, second},
  first[_] := {};
  second[_] := {};
  pair /: new[pair[]] := pair[Unique[]];
  pair /: pair[tag_].delete[] := (first[tag] =.; second[tag] =.);
  pair /: pair[tag_].setFirst[value_] := first[tag] = value;
  pair /: pair[tag_].getFirst[] := first[tag];
  pair /: pair[tag_].setSecond[value_] := second[tag] = value;
  pair /: pair[tag_].getSecond[] := second[tag];
  Format[pair[x_Symbol]] := "pair[" <> ToString[Hash[x]] <> "]";
];
Protect[pair, setFirst, getFirst, setSecond, getSecond, new, delete];

Create a list of new pair objects:

pairs = Table[new[pair[]], {10}]

{"pair[430427975]", "pair[430428059]", "pair[430428060]", "pair[430428057]",
"pair[430428058]", "pair[430428063]", "pair[430428064]", "pair[430428061]", 
"pair[430428062]", "pair[430428051]"}

Set the fields:

Module[{i},
 For[i = 1, i <= 10, i++,
  pairs[[i]].setFirst[10*i];
  pairs[[i]].setSecond[20*i];]]

Read the fields back:

#.getFirst[] & /@ pairs

{10, 20, 30, 40, 50, 60, 70, 80, 90, 100}

#.getSecond[] & /@ pairs

{20, 40, 60, 80, 100, 120, 140, 160, 180, 200}

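The post I mentioned has a more detailed discussion. One big problem with "objects" created in this way is that they are not garbage-collected automatically, which may be one of the main reasons why OOP extensions implemented in top-level Mathematica have never really taken off.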

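There are several OOP extensions for Mathematica, such as Roman Maeder's classes.m package (the source is in his "Mathematica Programmer" book), the commercial Objectica package, and several others. But until Mathematica itself provides efficient mechanisms (perhaps based on some kind of pointer or reference mechanism) for building mutable data structures (if that ever happens), top-level implementations of such data structures in mma will probably carry a large performance penalty. Also, since immutability is one of the core ideas of mma, it is not easy to make mutable data structures fit well with the other idioms of Mathematica programming.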

EDIT

Here is a bare-bones stateful tree implementation, along the lines of the example above:

Module[{parent, children, value},
  children[_] := {};
  value[_] := Null;
  node /: new[node[]] := node[Unique[]];
  node /: node[tag_].getChildren[] := children[tag];
  node /: node[tag_].addChild[child_node, index_] := 
        children[tag] = Insert[children[tag], child, index];
  node /: node[tag_].removeChild[index_] := 
        children[tag] = Delete[children[tag], index];
  node /: node[tag_].getChild[index_] := children[tag][[index]];
  node /: node[tag_].getValue[] := value[tag];
  node /: node[tag_].setValue[val_] := value[tag] = val;
];

Some examples of use:

In[68]:= root = new[node[]]

Out[68]= node[$7]

In[69]:= root.addChild[new[node[]], 1]

Out[69]= {node[$8]}

In[70]:= root.addChild[new[node[]], 2]

Out[70]= {node[$8], node[$9]}

In[71]:= root.getChild[1].addChild[new[node[]], 1]

Out[71]= {node[$10]}

In[72]:= root.getChild[1].getChild[1].setValue[10]

Out[72]= 10

In[73]:= root.getChild[1].getChild[1].getValue[]

Out[73]= 10

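For a non-trivial example of using this mutable tree data structure, see my post. It also contrasts this method with one that makes heavier reuse of Mathematica's native data structures and functions, and it illustrates well the points discussed at the beginning of this post.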

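"I have used Mathematica mostly as a mathematics workbench and for writing relatively small ad-hoc programs."

Mathematica really excels at this.

"What approach have you used with regard to data structures? Gradually developing your own util package?"

I avoid creating my own data structures in Mathematica because it cannot handle them efficiently. Specifically, general data structures...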

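The following Mathematica program counts the solutions of the n-queens problem using linked lists (nested pairs of the form {head, tail}):

(* True when queens at {x0, y0} and {x1, y1} do not attack each other *)
safe[{x0_, y0_}][{x1_, y1_}] := 
 x0 != x1 && y0 != y1 && x0 - y0 != x1 - y1 && x0 + y0 != x1 + y1

(* Keep the elements of a linked list that satisfy the predicate p *)
filter[_, {}] := {}
filter[p_, {h_, t_}] := If[p[h], {h, filter[p, t]}, filter[p, t]]

(* n = board size, nqs = queens placed so far, qs = their squares,
   4th argument = remaining candidate squares, a = solutions counted so far *)
search[n_, nqs_, qs_, {}, a_] := If[nqs == n, a + 1, a]
search[n_, nqs_, qs_, {q_, ps_}, a_] := 
 search[n, nqs, qs, ps, 
  search[n, nqs + 1, {q, qs}, filter[safe[q], ps], a]]

(* All n*n board squares as a linked list *)
ps[n_] := 
 Fold[{#2, #1} &, {}, Flatten[Table[{i, j}, {i, n}, {j, n}], 1]]

solve[n_] := search[n, 0, {}, ps[n], 0]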

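Here is the equivalent F#:

// True when two queens do not attack each other
let safe (x0, y0) (x1, y1) =
  x0<>x1 && y0<>y1 && x0-y0<>x1-y1 && x0+y0<>x1+y1

// Keep the elements of a list that satisfy the predicate f
let rec filter f = function
  | [] -> []
  | x::xs -> if f x then x::filter f xs else filter f xs

// n = board size, nqs = queens placed so far, qs = their squares,
// ps = remaining candidate squares, a = solutions counted so far
let rec search n nqs qs ps a =
  match ps with
  | [] -> if nqs=n then a+1 else a
  | q::ps ->
      search n (nqs+1) (q::qs) (filter (safe q) ps) a
      |> search n nqs qs ps

// All n*n board squares
let ps n =
  [ for i in 1..n do
      for j in 1..n do
        yield i, j ]

let solve n = search n 0 [] (ps n) 0

solve 8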

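An F# snippet that times List.length on a 10,000-element list, repeated 100 times:

let rand = System.Random()
// 10,000 random integers in the range 0..99
let xs = List.init 10000 (fun _ -> rand.Next 100)
// Each array element is the time in seconds taken by one List.length call
Array.init 100 (fun _ ->
  let t = System.Diagnostics.Stopwatch.StartNew()
  ignore(List.length xs)
  t.Elapsed.TotalSeconds)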