Creating a nested dictionary from a flat list with Python

algorithm python-3.x recursion

I have a list of files in the following form:

base/images/graphs/one.png
base/images/tikz/two.png
base/refs/images/three.png
base/one.txt
base/chapters/two.txt

I would like to convert them into a nested dictionary like this:

{ "name": "base" , "contents": 
  [{"name": "images" , "contents":
    [{"name": "graphs", "contents":[{"name":"one.png"}] },
     {"name":"tikz",     "contents":[{"name":"two.png"}]}
    ]
   }, 
   {"name": "refs", "contents":
    [{"name":"images", "contents": [{"name":"three.png"}]}]
   },
   {"name":"one.txt"},
   {"name": "chapters", "contents":[{"name":"two.txt"}]}
  ]
 }

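The problem with the solution I tried is that, given input such as

"images/datasetone/grapha.png", "images/datasetone/graphb.png"

each of the files ends up in a different dictionary named "datasetone". However, I want both to sit in the same parent dictionary, under the same directory. When several files share a common path, how can I build this nested structure without duplicating the parent dictionaries?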

Here is what I came up with, but it fails:

def path_to_tree(params):
    start = {}
    for item in params:
        parts = item.split('/')
        depth = len(parts)
        if depth > 1: 
            if "contents" in start.keys():
                start["contents"].append(create_base_dir(parts[0],parts[1:]))
            else:
                start ["contents"] = [create_base_dir(parts[0],parts[1:]) ]
        else:
            if "contents" in start.keys():
                start["contents"].append(create_leaf(parts[0]))
            else:
                start["contents"] =[ create_leaf(parts[0]) ]
    return start


def create_base_dir(base, parts):
    l={}
    if len(parts) >=1:
        l["name"] = base 
        l["contents"] = [  create_base_dir(parts[0],parts[1:]) ]
    elif len(parts)==0:
        l = create_leaf(base)
    return l 


def create_leaf(base): 
    l={}
    l["name"] = base
    return l 

b=["base/images/graphs/one.png","base/images/graphs/oneb.png","base/images/tikz/two.png","base/refs/images/three.png","base/one.txt","base/chapters/two.txt"]
d =path_to_tree(b)
from pprint import pprint
pprint(d)

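In this example you can see that there are as many dictionaries named "base" as there are files in the list, but only one is needed, and the subdirectories should be listed in its "contents" array.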

If you omit the redundant name tags, you can proceed like this:

import json

result = {}

records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
           "base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

# Split every path into its components.
recordsSplit = [record.split("/") for record in records]

for record in recordsSplit:
    here = result
    # Walk down the directories, creating a dictionary for each new one.
    for item in record[:-1]:
        if item not in here:
            here[item] = {}
        here = here[item]
    # Collect the file names of this directory under a marker key.
    if "###content###" not in here:
        here["###content###"] = []
    here["###content###"].append(record[-1])

print(json.dumps(result, indent=4))

The ### characters are there for uniqueness (there could be a folder named content somewhere in the hierarchy). Run it and look at the result.

Edit: fixed some typos, added output.
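If the exact name/contents layout from the question is still wanted, the keyed dictionary above could be converted in a second pass. The following is only a sketch under assumptions: the helper names build_index and to_name_contents are made up for illustration, the marker key "###content###" is taken from the answer, and within each directory the files end up listed after the subdirectories rather than in input order.

```python
import json

MARKER = "###content###"  # same marker key as in the answer above


def build_index(paths):
    """Build the nested dict keyed by directory name, as in the answer."""
    result = {}
    for path in paths:
        *dirs, filename = path.split("/")
        here = result
        for d in dirs:
            # Reuse the directory dict if it exists, else create it.
            here = here.setdefault(d, {})
        here.setdefault(MARKER, []).append(filename)
    return result


def to_name_contents(index):
    """Convert the keyed dict into a list of {"name", "contents"} nodes."""
    nodes = []
    for name, child in index.items():
        if name == MARKER:
            continue  # files are appended after the directories
        nodes.append({"name": name, "contents": to_name_contents(child)})
    nodes.extend({"name": f} for f in index.get(MARKER, []))
    return nodes


records = ["base/images/graphs/one.png", "base/images/tikz/two.png",
           "base/refs/images/three.png", "base/one.txt", "base/chapters/two.txt"]

tree = to_name_contents(build_index(records))
print(json.dumps(tree, indent=4))
```

Because directories are looked up by key, each parent exists exactly once no matter how many files share its path.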

This does not assume that all paths start with the same component, so we need a list at the top level:

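from pprint import pprint

def addBits2Tree(bits, tree):
    # A single remaining component is a file: store it as a leaf.
    if len(bits) == 1:
        tree.append({'name': bits[0]})
    else:
        # Reuse the directory node if it already exists at this level.
        for t in tree:
            if t['name'] == bits[0]:
                addBits2Tree(bits[1:], t['contents'])
                return
        # Otherwise create the directory and fill it recursively.
        newTree = []
        addBits2Tree(bits[1:], newTree)
        t = {'name': bits[0], 'contents': newTree}
        tree.append(t)

def addPath2Tree(path, tree):
    bits = path.split("/")
    addBits2Tree(bits, tree)

# b is the list of paths from the question.
b = ["base/images/graphs/one.png", "base/images/graphs/oneb.png",
     "base/images/tikz/two.png", "base/refs/images/three.png",
     "base/one.txt", "base/chapters/two.txt"]

tree = []
for p in b:
    print(p)
    addPath2Tree(p, tree)
pprint(tree)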

For your example list of paths, this produces the following:

[{'contents': [{'contents': [{'contents': [{'name': 'one.png'},
                                           {'name': 'oneb.png'}],
                              'name': 'graphs'},
                             {'contents': [{'name': 'two.png'}],
                              'name': 'tikz'}],
                'name': 'images'},
               {'contents': [{'contents': [{'name': 'three.png'}],
                              'name': 'images'}],
                'name': 'refs'},
               {'name': 'one.txt'},
               {'contents': [{'name': 'two.txt'}], 'name': 'chapters'}],
  'name': 'base'}]
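One way to sanity-check a tree like this is to flatten it back into paths and compare with the input. The sketch below is not part of the answer: flatten and add_path are hypothetical names, and add_path is an iterative variant of the recursive function that, unlike the original, only descends into nodes that actually have a "contents" list, so a file and a directory with the same name do not collide.

```python
def add_path(path, tree):
    """Insert one path into a list of name/contents nodes (iterative variant)."""
    *dirs, filename = path.split("/")
    for d in dirs:
        for node in tree:
            # Only an existing directory node with this name is reused.
            if node["name"] == d and "contents" in node:
                break
        else:
            # No such directory yet at this level: create it.
            node = {"name": d, "contents": []}
            tree.append(node)
        tree = node["contents"]
    tree.append({"name": filename})


def flatten(tree, prefix=""):
    """Yield the full path of every leaf in a list of name/contents nodes."""
    for node in tree:
        path = prefix + node["name"]
        if "contents" in node:
            yield from flatten(node["contents"], path + "/")
        else:
            yield path


paths = ["base/images/graphs/one.png", "base/images/graphs/oneb.png",
         "base/images/tikz/two.png", "base/refs/images/three.png",
         "base/one.txt", "base/chapters/two.txt"]

tree = []
for p in paths:
    add_path(p, tree)

round_trip = list(flatten(tree))
print(sorted(round_trip) == sorted(paths))
```

If the comparison holds and the top-level list has a single "base" entry, no parent directory was duplicated.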

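Please look at your example output:

"contents": ["name": "one.png"]

makes no sense. Why does one.png have no contents while one.txt does? Shouldn't one.txt be treated as a directory? If I run your example data through a syntax checker (pyflakes), this is what it reports:

data.py:3: invalid syntax [{"name": "graphs", "contents": ["name": "one.png"]}

Please fix the example data!!!!

@vorspring Sorry, I wrote it by hand while composing the question; I will correct it.

What is "result" at the start of the for loop? records is undefined as well.

Were the typos so serious that the answer needed to be downvoted?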