Python: identify root parents and all their children in trees
I have a pandas DataFrame:
parent  child  parent_level  child_level
A       B      0             1
B       C      1             2
B       D      1             2
X       Y      0             2
X       D      0             2
Y       Z      2             3
This represents trees that look like this:
   A   X
  /   / \
 B   /   \
 /\ /     \
C  D       Y
           |
           Z
I would like to produce something like this:
root children
A [B,C,D]
X [D,Y,Z]
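What is the fastest way to do this without looping? I have a very large DataFrame.

I suggest you use networkx, since this is a graph problem. In particular, the descendants function: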
import networkx as nx
import pandas as pd

data = [['A', 'B', 0, 1],
        ['B', 'C', 1, 2],
        ['B', 'D', 1, 2],
        ['X', 'Y', 0, 2],
        ['X', 'D', 0, 2],
        ['Y', 'Z', 2, 3]]
df = pd.DataFrame(data=data, columns=['parent', 'child', 'parent_level', 'child_level'])

# Roots are the parents that sit at level 0
roots = df.parent[df.parent_level.eq(0)].unique()
# Build a directed graph with one parent -> child edge per row
dg = nx.from_pandas_edgelist(df, source='parent', target='child', create_using=nx.DiGraph)
# nx.descendants returns the set of all nodes reachable from a root
result = pd.DataFrame(data=[[root, nx.descendants(dg, root)] for root in roots], columns=['root', 'children'])
print(result)
Output
root children
0 A {D, B, C}
1 X {Z, Y, D}
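The output above holds sets, whose print order is arbitrary, while the question asks for lists such as [B,C,D]. The following is a minimal sketch of one way to get sorted lists. It also assumes that a root can be detected as any node with no incoming edge, rather than through parent_level; that is an assumption of this sketch, not part of the answer above.

```python
import networkx as nx
import pandas as pd

df = pd.DataFrame(
    [['A', 'B'], ['B', 'C'], ['B', 'D'], ['X', 'Y'], ['X', 'D'], ['Y', 'Z']],
    columns=['parent', 'child'],
)
dg = nx.from_pandas_edgelist(df, source='parent', target='child', create_using=nx.DiGraph)

# Assumption: a root is any node that never appears as a child (in-degree 0)
roots = sorted(node for node, deg in dg.in_degree() if deg == 0)

# Sort each descendant set so the result is a deterministic list
result = pd.DataFrame(
    {'root': roots, 'children': [sorted(nx.descendants(dg, r)) for r in roots]}
)
print(result)
```

The list comprehension still visits every root once, so this does not remove the iteration entirely; the graph traversal itself is handled by networkx.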
Recursion

You can also make find_root a generator:
def find_root(tree, child):
    if child in tree:
        for x in tree[child]:
            yield from find_root(tree, x)
    else:
        yield child
Also, if you want to avoid recursion depth issues, you can define find_root with an explicit stack of iterators instead:
def find_root(tree, child):
    stack = [iter([child])]
    while stack:
        for node in stack[-1]:
            if node in tree:
                stack.append(iter(tree[node]))
            else:
                yield node
            break
        else:  # yes! that is an `else` clause on a for loop
            stack.pop()
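The text above does not show how tree is built or how find_root is called. The sketch below assumes tree is a dict mapping each child to the list of its direct parents, which is what the `if node in tree` check suggests. The names tree, roots_of and children_of are illustrative and do not come from the answer.

```python
import pandas as pd

def find_root(tree, child):
    # Iterative version: walk upward with a stack of iterators
    stack = [iter([child])]
    while stack:
        for node in stack[-1]:
            if node in tree:
                stack.append(iter(tree[node]))
            else:
                yield node  # no parent recorded, so this node is a root
            break
        else:
            stack.pop()

df = pd.DataFrame(
    [['A', 'B'], ['B', 'C'], ['B', 'D'], ['X', 'Y'], ['X', 'D'], ['Y', 'Z']],
    columns=['parent', 'child'],
)

# Assumed structure: child -> list of its direct parents
tree = df.groupby('child')['parent'].apply(list).to_dict()

# Roots reachable from every non-root node
roots_of = {node: sorted(set(find_root(tree, node))) for node in tree}

# Invert the mapping to get root -> all descendants
children_of = {}
for node, roots in roots_of.items():
    for root in roots:
        children_of.setdefault(root, []).append(node)

print(children_of)
```

This approach climbs from each child to its roots and then inverts the result, so it loops in plain Python and may be slower than a graph library on a very large DataFrame.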