Python: sorting hierarchical data

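My Python program returns a list containing sub-lists of data. Each sub-list contains the unique id of an article and the id of that article's parent, i.e.

pages_id_list ={ {22, 4},{45,1},{1,1}, {4,4},{566,45},{7,7},{783,566}, {66,1},{300,8},{8,4},{101,7},{80,22}, {17,17},{911,66} }

In each sub-list, the data is structured as {article_id, parent_id}. If article_id and parent_id are the same, it obviously means the article has no parent.

I would like to sort the data using the least amount of code, so that for each article I can readily access a list of its children and grandchildren (nested data), if available. For example (using the sample data above), I should be able to print at the end of the day: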

 1
 -45
 --566
 ---783
 -66
 --911

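... and so on.

I have only been able to sort out the highest-level (1st and 2nd generation) ids. I am having trouble getting the third generation and later descendants.

This is the code I used: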

highest_level = set()
first_level = set()
sub_level = set()

for i in pages_id_list:
    id,pid = i['id'],i['pid']

    if id == pid:
        #Pages of the highest hierarchy
        highest_level.add(id)

for i in pages_id_list:
    id,pid = i['id'],i['pid']

    if id != pid :
        if pid in highest_level:
            #First child pages
            first_level.add(id)
        else:
            sub_level.add(id)

Unfortunately, my code doesn't work.

Any help or nudge in the right direction would be appreciated. Thanks

David

Here is a simple approach (assuming the elements of your pages_id_list are not sets, as your code suggests):

The output will be:

\__1
  \__45
    \__566
      \__783
  \__66
    \__911
\__4
  \__8
    \__300
  \__22
    \__80
\__7
  \__101
\__17
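
One way to produce a listing like the one above is sketched below. It is not the answerer's original code: it assumes the pairs are stored as tuples, uses `collections.defaultdict` to group children by parent, and sorts roots and children by id. The names `children`, `roots`, `render` and `listing` are illustrative.

```python
from collections import defaultdict

pages_id_list = [(22, 4), (45, 1), (1, 1), (4, 4), (566, 45), (7, 7), (783, 566),
                 (66, 1), (300, 8), (8, 4), (101, 7), (80, 22), (17, 17), (911, 66)]

# Map each parent id to the ids of its direct children.
# An article whose parent is itself is treated as a root.
children = defaultdict(list)
roots = []
for article_id, parent_id in pages_id_list:
    if article_id == parent_id:
        roots.append(article_id)
    else:
        children[parent_id].append(article_id)


def render(article_id, depth=0):
    """Return the listing lines for article_id and all of its descendants."""
    lines = ['  ' * depth + '\\__' + str(article_id)]
    # .get() avoids creating empty entries in the defaultdict for leaf articles.
    for child in sorted(children.get(article_id, [])):
        lines.extend(render(child, depth + 1))
    return lines


listing = [line for root in sorted(roots) for line in render(root)]
print('\n'.join(listing))
```

Grouping the children once up front means each article is visited a single time while rendering, at the cost of one extra dictionary.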

Maybe something like this:

#! /usr/bin/python3.2

pages_id_list = [(22, 4), (45, 1), (1, 1), (4, 4), (566, 45), (7, 7), (783, 566),
                 (66, 1), (300, 8), (8, 4), (101, 7), (80, 22), (17, 17), (911, 66)]


class Node:
    def __init__(self, article):
        self.article = article
        self.children = []
        self.parent = None

    def print(self, level=0):
        # Indent each article by one tab per level of depth.
        print('{}{}'.format('\t' * level, self.article))
        for child in self.children:
            child.print(level + 1)


class Tree:
    def __init__(self):
        self.nodes = {}  # article id -> Node

    def push(self, item):
        article, parent = item
        # Create nodes on demand, so the order of the input does not matter.
        if parent not in self.nodes:
            self.nodes[parent] = Node(parent)
        if article not in self.nodes:
            self.nodes[article] = Node(article)
        if parent == article:
            return  # a root: it is its own parent
        self.nodes[article].parent = self.nodes[parent]
        self.nodes[parent].children.append(self.nodes[article])

    @property
    def roots(self):
        return (x for x in self.nodes.values() if not x.parent)


t = Tree()
for i in pages_id_list:
    t.push(i)
for node in t.roots:
    node.print()

This creates a tree structure that you can traverse to get all the children. You can access any article via

t.nodes[article]

and get its children via

t.nodes[article].children
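
To go beyond direct children and collect every descendant of an article, a traversal along the following lines could work. This is a sketch only: the `Node` class here is a reduced stand-in with the same `article` and `children` attributes, and `descendants` is a hypothetical helper, not part of the answer's code.

```python
class Node:
    """Reduced stand-in for the answer's Node: just an id and its children."""

    def __init__(self, article):
        self.article = article
        self.children = []


pages_id_list = [(22, 4), (45, 1), (1, 1), (4, 4), (566, 45), (7, 7), (783, 566),
                 (66, 1), (300, 8), (8, 4), (101, 7), (80, 22), (17, 17), (911, 66)]

# Build the id -> Node index the same way Tree.push does.
nodes = {}
for article, parent in pages_id_list:
    for key in (parent, article):
        if key not in nodes:
            nodes[key] = Node(key)
    if article != parent:
        nodes[parent].children.append(nodes[article])


def descendants(node):
    """Yield (depth, article) for every descendant of node, depth-first."""
    for child in node.children:
        yield 1, child.article
        for depth, article in descendants(child):
            yield depth + 1, article


below_one = list(descendants(nodes[1]))
print(below_one)  # [(1, 45), (2, 566), (3, 783), (1, 66), (2, 911)]
```

The depth value makes it easy to tell children (depth 1) from grandchildren (depth 2) and later generations.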

The output of the print method is:

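"I would like to sort the data using the least amount of code"

I only noticed this just now, so I will provide another answer. I won't edit my previous answers, because they are not really related. If you want to convert a list of tuples into a tree structure with minimal code, this approach is quite minimal, although it could still be reduced further (e.g. by using recursive lambda terms instead of functions):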
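
A minimal-code conversion in this spirit might look like the sketch below. It is an illustration rather than the answerer's original code: `subtree` and `tree` are assumed names, the result is a nested dict, and the data is assumed to contain no cycles other than a root pointing at itself.

```python
pages_id_list = [(22, 4), (45, 1), (1, 1), (4, 4), (566, 45), (7, 7), (783, 566),
                 (66, 1), (300, 8), (8, 4), (101, 7), (80, 22), (17, 17), (911, 66)]


def subtree(parent_id, pairs):
    """Nested dict of everything below parent_id; rescans pairs at every level."""
    return {a: subtree(a, pairs) for a, p in pairs if p == parent_id and a != p}


# Roots are the articles that are their own parent.
tree = {a: subtree(a, pages_id_list) for a, p in pages_id_list if a == p}
print(tree[1])  # {45: {566: {783: {}}}, 66: {911: {}}}
```

The brevity has a price: every call scans the whole list again, so the work grows roughly quadratically with the number of articles.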

Have you tried using a while loop?

@Hyperboreus I explained that in my answer.

Miku, that is useful to know, although I would prefer a solution that doesn't require a module. I have never worked much with the collections module before. Thanks

+1 for the Node/Tree objects, that is easier for me to understand. You could even override the __str__ function instead of using node.print()? Also keep in mind that Python will throw an error if the recursion depth exceeds 1000 levels (this part: child.print(level + 1)). If the tree is expected to be more than 1000 levels deep, the print function should be rewritten without recursion.

Thanks. This was the most useful for me because it has the least code.

Keep in mind that the computational cost is horrible.
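
The two suggestions in the comments, overriding __str__ and avoiding recursion, can be combined roughly as follows. This is a sketch under assumptions: the `Node` class is a reduced stand-in for the one in the answer, and the 2000-level chain is invented test data chosen only to exceed Python's default recursion limit of 1000.

```python
class Node:
    """Reduced stand-in for the answer's Node, with an iterative __str__."""

    def __init__(self, article):
        self.article = article
        self.children = []

    def __str__(self):
        # An explicit stack replaces recursion, so tree depth is not
        # bounded by the interpreter's recursion limit.
        lines = []
        stack = [(self, 0)]
        while stack:
            node, level = stack.pop()
            lines.append('\t' * level + str(node.article))
            # Push children in reverse so the first child is emitted first.
            stack.extend((child, level + 1) for child in reversed(node.children))
        return '\n'.join(lines)


# Invented data: a single chain 2000 levels deep.
root = Node(0)
tip = root
for i in range(1, 2000):
    child = Node(i)
    tip.children.append(child)
    tip = child

text = str(root)
print(len(text.splitlines()))  # 2000
```

With this in place, print(node) gives the same tab-indented layout as node.print() without a recursive call per level.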