How can I convert flattened data into structured JSON in Python?
This is the main flattened data, i.e. the input:
['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
This is the target data I need, i.e. the output:
[
    {
        "title": "a",
        "children": [
            {
                "title": "ab",
                "children": [
                    {
                        "title": "aba",
                        "children": [
                            {
                                "title": "abaa",
                                "children": [
                                    {
                                        "title": "abaaa"
                                    }
                                ]
                            },
                            {
                                "title": "abab"
                            }
                        ]
                    }
                ]
            },
            {
                "title": "ac",
                "children": [
                    {
                        "title": "aca",
                        "children": [
                            {
                                "title": "acaa"
                            },
                            {
                                "title": "acab"
                            }
                        ]
                    }
                ]
            }
        ]
    }
]
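I thought I could generate this JSON with deeply nested for loops, but that is too difficult because the number of levels will be greater than 10, so I don't think nested for loops can do the job. Is there an algorithm, or an existing package, that I could use to implement a function for this?
I would be very grateful if you could share your ideas. God bless you.
Here is a start: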
def populate_levels(dct, levels):
    # Walk one path, creating a nested dict for each level that is missing.
    if levels:
        if levels[0] not in dct:
            dct[levels[0]] = {}
        populate_levels(dct[levels[0]], levels[1:])

def create_final(dct):
    # Convert the nested template dict into the title/children list form.
    final = []
    for title in dct:
        final.append({"title": title, "children": create_final(dct[title])})
    return final

data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
template = {}
for item in data:
    populate_levels(template, item.split('-'))
final = create_final(template)
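Note that create_final above gives every leaf node an empty "children" list, while the target output in the question leaves the key out for leaves. A minimal sketch of a variant that omits it is shown here; the name create_final_sparse is hypothetical, and populate_levels is repeated (in a slightly shorter form using dict.setdefault) only so that the block runs on its own.

```python
def populate_levels(dct, levels):
    """Insert one path (a list of titles) into the nested template dict."""
    if levels:
        # setdefault returns the existing sub-dict or creates an empty one
        populate_levels(dct.setdefault(levels[0], {}), levels[1:])


def create_final_sparse(dct):
    """Like create_final, but leaf nodes get no "children" key (hypothetical name)."""
    final = []
    for title, sub in dct.items():
        node = {"title": title}
        if sub:  # attach children only when the node actually has some
            node["children"] = create_final_sparse(sub)
        final.append(node)
    return final


data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
template = {}
for item in data:
    populate_levels(template, item.split('-'))
final = create_final_sparse(template)
```

Whether leaves should carry an empty list or no key at all depends on what consumes the JSON, so treat this as one option rather than a correction.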
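I couldn't see a clean way to do it all in one pass, so I built the intermediate template dictionary first. As written, a node with no children still gets "children": [] in its dict. You can change this behavior in the create_final function if you prefer.
Here is a recursive solution using itertools. I don't know whether it is efficient enough for your purposes, but it works. It converts the list of strings into a list of lists, groups those lists by their first key, builds a dict for each group, and then repeats the process on each group with the first key removed.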
from itertools import groupby
from pprint import pprint

data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
components = [x.split("-") for x in data]

def build_dict(component_list):
    key = lambda x: x[0]
    # groupby only merges adjacent items, so sort by the same key first
    component_list = sorted(component_list, key=key)
    # divide into lists with the same first key
    sublists = groupby(component_list, key)
    result = []
    for name, values in sublists:
        value = {}
        value["title"] = name
        # drop the first key and skip paths that end at this node
        value["children"] = build_dict([x[1:] for x in values if x[1:]])
        result.append(value)
    return result

pprint(build_dict(components))
Output:
[{'children': [{'children': [{'children': [{'children': [{'children': [],
                                                          'title': 'abaaa'}],
                                            'title': 'abaa'},
                                           {'children': [], 'title': 'abab'}],
                              'title': 'aba'}],
                'title': 'ab'},
               {'children': [{'children': [{'children': [], 'title': 'acaa'},
                                           {'children': [], 'title': 'acab'}],
                              'title': 'aca'}],
                'title': 'ac'}],
  'title': 'a'}]
To convert this structure to JSON, you can use json.dumps from the json module. I hope my explanation is clear.
You can use collections.defaultdict:
from collections import defaultdict

def get_struct(d):
    _d = defaultdict(list)
    for a, *b in d:
        _d[a].append(b)
    return [{'title':a, 'children':get_struct(filter(None, b))} for a, b in _d.items()]

data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
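The snippet above defines get_struct and data but does not show the call. Since get_struct unpacks each item as a sequence of titles, the strings presumably need to be split on '-' first. The following is a sketch under that assumption; the names result and text and the indent=4 setting are illustrative choices, and get_struct is repeated only so the block runs on its own.

```python
import json
from collections import defaultdict


def get_struct(d):
    # Same function as above, repeated so this block is self-contained.
    _d = defaultdict(list)
    for a, *b in d:
        _d[a].append(b)  # group the remaining path parts under the first title
    return [{'title': a, 'children': get_struct(filter(None, b))} for a, b in _d.items()]


data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']

# Assumed call: split every path into its parts before passing it in.
result = get_struct([path.split('-') for path in data])

# Serialize the nested list of dicts to a JSON string.
text = json.dumps(result, indent=4)
print(text)
```

Here filter(None, b) discards the empty remainders left when a path ends, which is why leaf nodes end up with an empty children list.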
Output:
[
    {
        "title": "a",
        "children": [
            {
                "title": "ab",
                "children": [
                    {
                        "title": "aba",
                        "children": [
                            {
                                "title": "abaa",
                                "children": [
                                    {
                                        "title": "abaaa",
                                        "children": []
                                    }
                                ]
                            },
                            {
                                "title": "abab",
                                "children": []
                            }
                        ]
                    }
                ]
            },
            {
                "title": "ac",
                "children": [
                    {
                        "title": "aca",
                        "children": [
                            {
                                "title": "acaa",
                                "children": []
                            },
                            {
                                "title": "acab",
                                "children": []
                            }
                        ]
                    }
                ]
            }
        ]
    }
]
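Because the question worries about data that is more than 10 levels deep, it may be worth noting that the tree can also be built in a single pass with no recursion at all. This is a sketch of that alternative, not a method taken from the answers above; build_tree and its sep parameter are hypothetical names. It keeps an index from each path prefix to its node, and creates the "children" key only when a node receives its first child, so leaves match the target format in the question.

```python
def build_tree(paths, sep='-'):
    """Build the title/children tree iteratively (hypothetical helper)."""
    roots = []
    index = {}  # maps a path prefix (tuple of titles) to its node dict
    for path in paths:
        parent = None
        prefix = ()
        for title in path.split(sep):
            prefix += (title,)
            node = index.get(prefix)
            if node is None:
                node = {"title": title}
                index[prefix] = node
                if parent is None:
                    roots.append(node)
                else:
                    # "children" appears only once a node has a child
                    parent.setdefault("children", []).append(node)
            parent = node
    return roots


data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
tree = build_tree(data)
```

Keying the index on the whole prefix rather than on the title alone means that the same title can appear under different parents without the nodes being merged.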
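This is a nice solution. I think you could check len(component_list) > 0 first, and then the body of the for loop could be a one-liner. But anyway, +1.
Unbelievable, this approach works. Thanks a million!
Amazing, this approach works too! Great, thanks a million!!!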