Generating a nested list from flattened data in Python

python list recursion tree

To generate a table of contents, I have the following data available in a Python list:

data = [
    {'title': 'Section 1', 'level': 1, 'page_number': 1},
    {'title': 'Section 1.1', 'level': 2, 'page_number': 2},
    {'title': 'Section 1.2', 'level': 2, 'page_number': 3},
    {'title': 'Section 2', 'level': 1, 'page_number': 4},
    {'title': 'Section 2.1', 'level': 2, 'page_number': 5},
    {'title': 'Section 3', 'level': 1, 'page_number': 6},
]

From this, I would like to obtain the following nested structure, which is much easier to use with a template engine:

toc = [
    {'title': 'Section 1', 'page_number': 1, 'sub': [
        {'title': 'Section 1.1', 'page_number': 2, 'sub': []},
        {'title': 'Section 1.2', 'page_number': 3, 'sub': []},
    ]},
    {'title': 'Section 2', 'page_number': 4, 'sub': [
        {'title': 'Section 2.1', 'page_number': 5, 'sub': []},
    ]},
    {'title': 'Section 3', 'page_number': 6, 'sub': []},
]

Any hints on how to achieve this? I tried a recursive function, but it got pretty tricky for my limited brain.

Any help would be much appreciated.

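Edit: added the fact that a section entry may end up with no children. Sorry for missing that.

You can walk through the entries in the list and build a new list from them. Whenever you come across a "Section x.y" entry, you append it to the sub list of the current parent:

newData = []
curParent = None
for d in data:
    if '.' in d['title']:
        # child: attach it to the most recent top-level section
        assert curParent is not None  # make sure we have a valid parent dictionary
        curParent['sub'].append({'title': d['title'], 'page_number': d['page_number'], 'sub': []})
    else:
        # parent: start a new top-level section
        curParent = {'title': d['title'], 'page_number': d['page_number'], 'sub': []}
        newData.append(curParent)

This should work for two levels. A deeper entry such as 'Section 1.1.1' would simply be appended to the top-level section's sub list, so if you need more levels, a different approach is probably better. Also, checking for '.' may not work for other titles, but you could use the level field (which looks redundant in your example) or a regular expression instead.

Edit: changed it so that entries without children get an empty 'sub' item. See the edit history for the other variant.

Assuming the sections are in the correct order, meaning a subsection always comes after its parent and no parent is missing (no skipped levels):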

from pprint import pprint

TITLE, LEVEL, PAGE_NUMBER, SUB = 'title', 'level', 'page_number', 'sub'
data = [
    {TITLE: 'Section 1', LEVEL: 1, PAGE_NUMBER: 1},
    {TITLE: 'Section 1.1', LEVEL: 2, PAGE_NUMBER: 2},
    {TITLE: 'Section 1.1.1', LEVEL: 3, PAGE_NUMBER: 2},
    {TITLE: 'Section 1.2', LEVEL: 2, PAGE_NUMBER: 3},
    {TITLE: 'Section 2', LEVEL: 1, PAGE_NUMBER: 4},
    {TITLE: 'Section 2.1', LEVEL: 2, PAGE_NUMBER: 5},
]

# levels[0] is a dummy root; levels[n] is the most recent section at level n.
levels = [{SUB: []}]
for section in data:
    section = dict(section)  # copy, so the input entries are not modified
    current = section[LEVEL]
    section[SUB] = []  # every section gets a (possibly empty) sub list
    levels[current - 1][SUB].append(section)  # attach to the latest parent
    del levels[current:]  # forget stale sections at this level and deeper
    levels.append(section)  # this section is now the latest at its level

toc = levels[0][SUB]
pprint(toc)
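
The same loop can be wrapped in a function so it is easier to call from template code. The following is only a sketch under the same assumptions (entries in order, no skipped levels); the name build_toc and the choice to drop the 'level' key from the output, so that it matches the structure asked for in the question, are illustrative and not part of the original answer.

```python
from pprint import pprint


def build_toc(entries):
    """Nest a flat, ordered list of sections by their 'level' field."""
    root = {'sub': []}
    stack = [root]  # stack[n] is the most recent section seen at level n
    for entry in entries:
        level = entry['level']
        # Copy every field except 'level' and give each node a 'sub' list.
        node = {key: value for key, value in entry.items() if key != 'level'}
        node['sub'] = []
        stack[level - 1]['sub'].append(node)  # IndexError if a level is skipped
        del stack[level:]  # drop stale nodes at this level and deeper
        stack.append(node)
    return root['sub']


flat = [
    {'title': 'Section 1', 'level': 1, 'page_number': 1},
    {'title': 'Section 1.1', 'level': 2, 'page_number': 2},
    {'title': 'Section 1.2', 'level': 2, 'page_number': 3},
    {'title': 'Section 2', 'level': 1, 'page_number': 4},
    {'title': 'Section 2.1', 'level': 2, 'page_number': 5},
    {'title': 'Section 3', 'level': 1, 'page_number': 6},
]

pprint(build_toc(flat))
```

Because each node is a fresh dictionary, the input list is left untouched and the function can be called repeatedly on the same data.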

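No recursion is needed; a loop and a stack-like list structure work just as well.

Comments:

Ah, I should have stated in the question that sections can also have no subsections... but there is great stuff here, food for thought.

Hey, you're fast. OK, now it works as expected, thanks a lot :)

@NiKo The difference is not whether a section can have no children (that was clear), but whether you want an empty "sub" entry in that case. My first code did not put in the empty "sub"; this one does.

It is still a stack, it just looks a bit different. I think the main difference is that you only keep the sub list for each level, where I keep the whole section. Whether that matters might be an interesting question.

Yes, I can have many nested levels. Also, I really don't want to rely on dots in the section titles ;)

Downvoting every imperfect answer is not very nice. It is not as if I deliberately wrote something wrong; it is just a different approach.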
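
The variant mentioned in the comments, where the stack holds only the sub list of each level rather than the whole section, is not shown on this page. A rough reconstruction might look like the sketch below; the function name build_toc_sublists and the variable names are assumptions made for illustration, and the same ordering assumption applies (no skipped levels).

```python
def build_toc_sublists(entries):
    """Nest sections using a stack of 'sub' lists instead of whole sections."""
    toc = []
    # open_lists[n] is the list that receives entries of level n + 1.
    open_lists = [toc]
    for entry in entries:
        level = entry['level']
        node = {
            'title': entry['title'],
            'page_number': entry['page_number'],
            'sub': [],
        }
        del open_lists[level:]  # close lists that are deeper than this entry
        open_lists[level - 1].append(node)  # add to the parent's list
        open_lists.append(node['sub'])  # children of this entry go here
    return toc


sections = [
    {'title': 'Section 1', 'level': 1, 'page_number': 1},
    {'title': 'Section 1.1', 'level': 2, 'page_number': 2},
    {'title': 'Section 1.1.1', 'level': 3, 'page_number': 2},
    {'title': 'Section 1.2', 'level': 2, 'page_number': 3},
    {'title': 'Section 2', 'level': 1, 'page_number': 4},
    {'title': 'Section 2.1', 'level': 2, 'page_number': 5},
]

nested = build_toc_sublists(sections)
```

Both forms produce the same tree. Keeping only the lists means no dummy root dictionary is needed, while keeping whole sections leaves the parent's other fields at hand if they are ever wanted.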
