Python: building a specific nested dictionary from a DataFrame with for loops

python json pandas

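I am trying to create a specific nested dictionary from a DataFrame, under certain conditions, so that I can use it for a visualization.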

import pandas as pd

dat = pd.DataFrame({'cat_1' : ['marketing', 'marketing', 'marketing', 'communications'],
                    'child_cat' : ['marketing', 'social media', 'marketing', 'communications'],
                    'skill' : ['digital marketing','media marketing','research','seo'],
                    'value' : ['80', '101', '35', '31']})

I want to turn it into a dictionary that looks something like this:

{
  "name": "general skills",
  "children": [
    {
      "name": "marketing",
      "children": [
        {
          "name": "marketing",
          "children": [
            {
              "name": "digital marketing",
              "value": 80
            },
            {
              "name": "research",
              "value": 35
            }
          ]
        },
        {
          "name": "social media", // notice that this is a sibling of the parent marketing
          "children": [
            {
              "name": "media marketing",
              "value": 101
            }
          ]
        }
      ]
    },
    {
      "name": "communications",
      "children": [
        {
          "name": "communications",
          "children": [
            {
              "name": "seo",
              "value": 31
            }
          ]
        }
      ]
    }
  ]
}

So cat_1 is the parent node, child_cat is its child, and skill is in turn a child of child_cat. I am having trouble building the nested children lists. Any help?

After a lot of inefficient attempts, I came up with this solution. It is probably highly suboptimal.

final = {}
# control dict to get only one broad category

contrl_dict = {}
contrl_dict['dummy'] = None
final['name'] = 'variants'
final['children'] = []

# line is the values of each row
for idx, line in enumerate(df_dict.values):
    # parent categories dict
    broad_dict_1 = {}
    print(line)

    # this takes every value of the row minus the value in the end
    for jdx, col in enumerate(line[:-1]):
        # look into the broad category first
        if jdx == 0:
            # check in our control dict - does this category exist? if not add it and continue
            if col not in contrl_dict:

                # if it doesn't it appends it
                contrl_dict[col] = 'added'
                # then the broad dict parent takes the name

                broad_dict_1['name'] = col
                # the children are the children broad categories which will be populated further
                broad_dict_1['children'] = []
                # go to broad categories 2

                for ydx, broad_2 in enumerate(list(df_dict[df_dict.broad_categories == col].broad_2.unique())):
                    # sub categories dict
                    prov_dict = {}

                    prov_dict['name'] = broad_2
                    # children is again a list
                    prov_dict['children'] = []

                    # now isolate the skills and values of each broad_2 category
                    # within the current broad category and append them
                    for row in df_dict[(df_dict.broad_categories == col) & (df_dict.broad_2 == broad_2)].values:
                        prov_d_3 = {}
                        # go through each value of the row
                        for xdx, direct in enumerate(row):
                            # values 2 and 3 of each row are the name and value respectively; add them

                            if xdx == 2:
                                prov_d_3['name'] = direct
                            if xdx == 3:
                                prov_d_3['size'] = direct

                        prov_dict['children'].append(prov_d_3)


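                    broad_dict_1['children'].append(prov_dict)

                # append the parent only when it was newly created, so rows of
                # an already-seen category do not add empty dicts to the result
                final['children'].append(broad_dict_1)

        # every column after the first is skipped
        else:
            continue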
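
For comparison, the same kind of tree can probably be built with nested groupby calls instead of a control dict. This is only a minimal sketch under a few assumptions: it uses the column names from the question (cat_1, child_cat, skill, value) rather than the df_dict columns used above, the helper name build_tree and its root_name argument are made up for illustration, and value is cast to int so the leaves match the desired output.

```python
import json

import pandas as pd


def build_tree(df, root_name="general skills"):
    """Nest skill under child_cat under cat_1 (hypothetical helper)."""
    tree = {"name": root_name, "children": []}
    # sort=False keeps the categories in order of first appearance
    for parent, parent_df in df.groupby("cat_1", sort=False):
        parent_node = {"name": parent, "children": []}
        for child, child_df in parent_df.groupby("child_cat", sort=False):
            # one leaf per row; the values are stored as strings, so cast them
            leaves = [
                {"name": skill, "value": int(value)}
                for skill, value in zip(child_df["skill"], child_df["value"])
            ]
            parent_node["children"].append({"name": child, "children": leaves})
        tree["children"].append(parent_node)
    return tree


dat = pd.DataFrame({
    "cat_1": ["marketing", "marketing", "marketing", "communications"],
    "child_cat": ["marketing", "social media", "marketing", "communications"],
    "skill": ["digital marketing", "media marketing", "research", "seo"],
    "value": ["80", "101", "35", "31"],
})

tree = build_tree(dat)
print(json.dumps(tree, indent=2))
```

Grouping the rows of each parent again by child_cat means a child category with the same name under two different parents stays separate, which the row-by-row loop has to handle by hand.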