Python: how do I convert flattened data into structured JSON?

This is the flattened source data, i.e. the input:

['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']

This is the target data I need, i.e. the output:

[
  {
    "title": "a",
    "children": [
      {
        "title": "ab",
        "children": [
          {
            "title": "aba",
            "children": [
              {
                "title": "abaa",
                "children": [
                  {
                    "title": "abaaa"
                  }
                ]
              },
              {
                "title": "abab"
              }
            ]
          }
        ]
      },
      {
        "title": "ac",
        "children": [
          {
            "title": "aca",
            "children": [
              {
                "title": "acaa"
              },
              {
                "title": "acab"
              }
            ]
          }
        ]
      }
    ]
  }
]
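
I thought I could generate this JSON with deeply nested for loops, but that is impractical because the number of levels can be greater than 10, so I don't think hand-written nested loops can do the job. Is there an algorithm, or an existing package, that implements a function for this? I would be very grateful if you could share your ideas. God bless you.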

Here's a start:

def populate_levels(dct, levels):
    if levels:
        if levels[0] not in dct:
            dct[levels[0]] = {}
        populate_levels(dct[levels[0]], levels[1:])


def create_final(dct):
    final = []
    for title in dct:
        final.append({"title": title, "children": create_final(dct[title])})
    return final


data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
template = {}

for item in data:
    populate_levels(template, item.split('-'))

final = create_final(template)
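
I couldn't see a clean way to do it all in one pass, so I build the intermediate template dictionary first. As written, if a "node" has no children, its dict will contain "children": [].

You can change this behavior in the create_final function if you like.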
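
The target output in the question leaves out the "children" key for leaf nodes. One way that change could look is sketched below, assuming the nested dict built by populate_levels above. The name create_final_no_empty is made up for this illustration and is not part of the original answer.

```python
def create_final_no_empty(dct):
    """Like create_final, but omits "children" for nodes without any."""
    final = []
    for title, sub in dct.items():
        node = {"title": title}
        if sub:  # only attach children when the node actually has some
            node["children"] = create_final_no_empty(sub)
        final.append(node)
    return final


# Small hand-built template, in the shape populate_levels produces.
template = {"a": {"ab": {}, "ac": {"aca": {}}}}
result = create_final_no_empty(template)
print(result)
```

Since Python 3.7 plain dicts keep insertion order, so siblings come out in the order they were first seen in the input.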

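Here is a recursive solution using itertools. I don't know whether it is efficient enough for your needs, but it does work. It converts the list of strings into a list of lists, groups those lists by their first key, then builds a dict for each group and recurses on the remainder with the first key removed.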

from itertools import groupby
from pprint import pprint

data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
components = [x.split("-") for x in data]

def build_dict(component_list):
    key = lambda x: x[0]
    component_list = sorted(component_list, key=key)
    # divide into lists with the same first key
    sublists = groupby(component_list, key)
    result = []

    for name, values in sublists:
        value = {}
        value["title"] = name
        value["children"] = build_dict([x[1:] for x in values if x[1:]])
        result.append(value)
    return result

pprint(build_dict(components))

Output:

[{'children': [{'children': [{'children': [{'children': [{'children': [],
                                                          'title': 'abaaa'}],
                                            'title': 'abaa'},
                                           {'children': [], 'title': 'abab'}],
                              'title': 'aba'}],
                'title': 'ab'},
               {'children': [{'children': [{'children': [], 'title': 'acaa'},
                                           {'children': [], 'title': 'acab'}],
                              'title': 'aca'}],
                'title': 'ac'}],
  'title': 'a'}]

To convert this result to JSON, you can use json.dumps from the json module. I hope my explanation is clear.
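
As a small illustration of that last step, json.dumps from the standard library turns a list of dicts like the one above into a JSON string. The sample value below is a shortened stand-in, not the full result.

```python
import json

# Shortened stand-in for the list that build_dict returns.
tree = [{"title": "a", "children": [{"title": "ab", "children": []}]}]

# indent is optional; it only affects how the string is laid out.
text = json.dumps(tree, indent=2)
print(text)

# Parsing the string again gives back the same structure.
roundtrip = json.loads(text)
```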

You can use collections.defaultdict:

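import json
from collections import defaultdict

def get_struct(d):
    # Group the remaining path components under their first component.
    _d = defaultdict(list)
    for a, *b in d:
        _d[a].append(b)
    # filter(None, b) drops empty remainders, so leaves get 'children': [].
    return [{'title': a, 'children': get_struct(filter(None, b))} for a, b in _d.items()]

data = ['a-ab-aba-abaa-abaaa', 'a-ab-aba-abab', 'a-ac-aca-acaa', 'a-ac-aca-acab']
# Assumed call (missing from the original), chosen to produce JSON like the output below.
print(json.dumps(get_struct([i.split('-') for i in data]), indent=4))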

Output:

[
    {
        "title": "a",
        "children": [
            {
                "title": "ab",
                "children": [
                    {
                        "title": "aba",
                        "children": [
                            {
                                "title": "abaa",
                                "children": [
                                    {
                                        "title": "abaaa",
                                        "children": []
                                    }
                                ]
                            },
                            {
                                "title": "abab",
                                "children": []
                            }
                        ]
                    }
                ]
            },
            {
                "title": "ac",
                "children": [
                    {
                        "title": "aca",
                        "children": [
                            {
                                "title": "acaa",
                                "children": []
                            },
                            {
                                "title": "acab",
                                "children": []
                            }
                        ]
                    }
                ]
            }
        ]
    }
]

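This is a nice solution. I think you could check len(component_list) > 0 first, and then the body of the for loop could be a one-liner. But +1 anyway :)

Incredible, this approach works. Thanks a million!

Amazing, this method works well too! Awesome, thanks a million!!!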