Create a JSON hierarchy tree from a two-column DataFrame for a d3 collapsible tree (Python 3)
I have a DataFrame with two columns: Employee and Reports_to. Every employee reports to someone, all the way up to the CEO. I want to convert it to a JSON file that a collapsible d3 tree can consume (per this great link:). That would give a nice org chart that stays up to date with little or no manual work.

I have managed to convert the df to the right JSON format, as in the simple example below. However, I did it very painfully in Excel and then copied and pasted the .append strings into Jupyter (!). So here is my question: is there an elegant way in Python 3 to convert the two-column df into the required dict?
import numpy as np
import pandas as pd
import json
#_Lx refers to the level in the organisation, where Jackie_L1 is the CEO
df = pd.DataFrame(np.array([
['Jo_L3','Jane_L2'],
['Jon_L3','Jane_L2'],
['James_L3','Jerry_L2'],
['Joan_L3','Jerry_L2'],
['Jane_L2','Jackie_L1'],
['Jerry_L2','Jackie_L1'],
['Jill_L2','Jackie_L1']]))
df.columns = ['Employee','Reports_to']
df
Employee Reports_to
Jo_L3 Jane_L2
Jon_L3 Jane_L2
James_L3 Jerry_L2
Joan_L3 Jerry_L2
Jane_L2 Jackie_L1
Jerry_L2 Jackie_L1
Jill_L2 Jackie_L1
#start with the root node and work over to the right (down the organisation) to provide the required json:
tree = {'parent': 'null', 'name': 'Jackie_L1', 'edge_name': 'Jackie_L1', 'children': []}
tree['children'].append({'parent': 'Jackie_L1', 'name': 'Jane_L2', 'edge_name': 'Jane_L2', 'children': []})
tree['children'].append({'parent': 'Jackie_L1', 'name': 'Jerry_L2', 'edge_name': 'Jerry_L2', 'children': []})
tree['children'].append({'parent': 'Jackie_L1', 'name': 'Jill_L2', 'edge_name': 'Jill_L2', 'children': []})
tree['children'][0]['children'].append({'parent': 'Jane_L2', 'name': 'Jo_L3', 'edge_name': 'Jo_L3', 'children': []})
tree['children'][0]['children'].append({'parent': 'Jane_L2', 'name': 'Jon_L3', 'edge_name': 'Jon_L3', 'children': []})
tree['children'][1]['children'].append({'parent': 'Jerry_L2', 'name': 'James_L3', 'edge_name': 'James_L3', 'children': []})
tree['children'][1]['children'].append({'parent': 'Jerry_L2', 'name': 'Joan_L3', 'edge_name': 'Joan_L3', 'children': []})
Here is the resulting dict that the d3 tree requires:
{'parent': 'null',
'name': 'Jackie_L1',
'edge_name': 'Jackie_L1',
'children': [{'parent': 'Jackie_L1',
'name': 'Jane_L2',
'edge_name': 'Jane_L2',
'children': [{'parent': 'Jane_L2',
'name': 'Jo_L3',
'edge_name': 'Jo_L3',
'children': []},
{'parent': 'Jane_L2',
'name': 'Jon_L3',
'edge_name': 'Jon_L3',
'children': []}]},
{'parent': 'Jackie_L1',
'name': 'Jerry_L2',
'edge_name': 'Jerry_L2',
'children': [{'parent': 'Jerry_L2',
'name': 'James_L3',
'edge_name': 'James_L3',
'children': []},
{'parent': 'Jerry_L2',
'name': 'Joan_L3',
'edge_name': 'Joan_L3',
'children': []}]},
{'parent': 'Jackie_L1',
'name': 'Jill_L2',
'edge_name': 'Jill_L2',
'children': []}]}
I convert tree to a JSON file as follows:
with open('C:/Python37/input_graph_tree.json', 'w') as outfile:
    json.dump(tree, outfile)
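A quick way to confirm the file contains what the d3 page will read is to load it back and compare. The following is only a sketch: it writes to a temporary directory rather than the path used above, uses a shortened tree, and adds `indent=2` purely to make the file easier to inspect by hand. Note that 'parent': 'null' is stored as the string "null", not as a JSON null.

```python
import json
import os
import tempfile

# Shortened version of the tree above, for illustration only.
tree = {
    'parent': 'null', 'name': 'Jackie_L1', 'edge_name': 'Jackie_L1',
    'children': [
        {'parent': 'Jackie_L1', 'name': 'Jill_L2', 'edge_name': 'Jill_L2', 'children': []},
    ],
}

# Assumption: a temporary directory stands in for C:/Python37.
path = os.path.join(tempfile.mkdtemp(), 'input_graph_tree.json')

with open(path, 'w') as outfile:
    json.dump(tree, outfile, indent=2)  # indent only affects readability

with open(path) as infile:
    loaded = json.load(infile)

print(loaded == tree)
```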
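The link above gives instructions for running the collapsible tree on your desktop, although with Python 3 you need to start it with
python -m http.server 8080
rather than python -m SimpleHTTPServer 8080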
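Thanks to Jonathan Eunice, I found a way. Run the code below, then follow the instructions in the question to display the d3 tree in the browser. This gives you the tree shown in the question. It will fail if the tree is too large, but you can build one main branch at a time and merge them into a single tree at the end (though not easily).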
#using the df example above, add a row for the top person:
lastrow = len(df)
df.loc[lastrow] = np.nan
df.loc[lastrow, 'Employee'] = 'Jackie_L1'
df.loc[lastrow, 'Reports_to'] = '' #top person does not report to anyone
#create a new column called eid that is a copy of Employee (to make the 'buildtree' function below work):
df['eid'] = df['Employee']
df = df[['eid', 'Employee', 'Reports_to']] #get the order right
#then run this
from pprint import pprint
from collections import defaultdict
def show_val(title, val):
    sep = '-' * len(title)
    print("\n{0}\n{1}\n{2}\n".format(sep, title, sep))
    pprint(val)

def buildtree(t=None, parent_eid=''):
    """
    Given a parents lookup structure, construct
    a data hierarchy.
    """
    parent = parents.get(parent_eid, None)
    if parent is None:
        return t
    for eid, name, mid in parent:
        if mid == '':
            report = {'parent': 'null', 'name': name, 'edge_name': name}
        else:
            report = {'parent': mid, 'name': name, 'edge_name': name}
        if t is None:
            t = report
        else:
            reports = t.setdefault('children', [])
            reports.append(report)
        buildtree(report, eid)
    return t

people = list(df.itertuples(index=False, name=None))
parents = defaultdict(list)
for p in people:
    parents[p[2]].append(p)
tree = buildtree()
show_val("data", tree)
#which gives you:
----
data
----
{'children': [{'children': [{'edge_name': 'Jo_L3',
'name': 'Jo_L3',
'parent': 'Jane_L2'},
{'edge_name': 'Jon_L3',
'name': 'Jon_L3',
'parent': 'Jane_L2'}],
'edge_name': 'Jane_L2',
'name': 'Jane_L2',
'parent': 'Jackie_L1'},
{'children': [{'edge_name': 'James_L3',
'name': 'James_L3',
'parent': 'Jerry_L2'},
{'edge_name': 'Joan_L3',
'name': 'Joan_L3',
'parent': 'Jerry_L2'}],
'edge_name': 'Jerry_L2',
'name': 'Jerry_L2',
'parent': 'Jackie_L1'},
{'edge_name': 'Jill_L2',
'name': 'Jill_L2',
'parent': 'Jackie_L1'}],
'edge_name': 'Jackie_L1',
'name': 'Jackie_L1',
'parent': 'null'}
#then write to json:
with open('C:/Python37/input_graph_tree.json', 'w') as outfile:
    json.dump(tree, outfile)
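As an alternative to the recursive buildtree above, the same nested dict can probably be assembled in a single pass over the rows, without recursion and without adding the extra row or the eid column. The sketch below is not the method from the answer; `build_tree_flat` and its parameters are hypothetical names chosen for illustration. It keeps one dict per person in a lookup table and attaches each person to their manager's 'children' list. Unlike buildtree, it gives every node a 'children' key, including leaves, which matches the hand-built dict in the question.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [['Jo_L3', 'Jane_L2'], ['Jon_L3', 'Jane_L2'],
     ['James_L3', 'Jerry_L2'], ['Joan_L3', 'Jerry_L2'],
     ['Jane_L2', 'Jackie_L1'], ['Jerry_L2', 'Jackie_L1'],
     ['Jill_L2', 'Jackie_L1']],
    columns=['Employee', 'Reports_to'])

def build_tree_flat(frame, child_col='Employee', parent_col='Reports_to'):
    """Hypothetical helper: build the nested d3 dict without recursion."""
    nodes = {}

    def node(name):
        # Create the node on first sight; 'parent' stays 'null' until a manager is seen.
        if name not in nodes:
            nodes[name] = {'parent': 'null', 'name': name,
                           'edge_name': name, 'children': []}
        return nodes[name]

    for child, parent in zip(frame[child_col], frame[parent_col]):
        child_node = node(child)
        if pd.isna(parent) or parent == '':
            continue  # top person: reports to nobody
        child_node['parent'] = parent
        node(parent)['children'].append(child_node)

    # Anyone who never appeared with a manager is a root.
    roots = [n for n in nodes.values() if n['parent'] == 'null']
    if len(roots) != 1:
        raise ValueError('expected exactly one root, found %d' % len(roots))
    return roots[0]

tree = build_tree_flat(df)
print(tree['name'], [c['name'] for c in tree['children']])
```

Because nodes are linked through the lookup table, the rows can arrive in any order. The sketch does not check for cycles (for example, two people reporting to each other), so such data would need validating separately.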