Python: identifying root parents and all their children in a tree

I have a pandas DataFrame:

parent   child   parent_level   child_level
A        B       0              1
B        C       1              2
B        D       1              2
X        Y       0              2
X        D       0              2 
Y        Z       2              3
This represents a tree that looks like this:

       A  X
      /  / \
     B  /   \
    /\ /     \
   C  D       Y
              |
              Z
I would like to produce something like this:

root    children
A       [B,C,D]
X       [D,Y,Z]

What is the fastest way to do this without loops? I have a very large DataFrame.

I suggest using networkx, since this is a graph problem. In particular, the descendants function:

import networkx as nx
import pandas as pd

data = [['A', 'B', 0, 1],
        ['B', 'C', 1, 2],
        ['B', 'D', 1, 2],
        ['X', 'Y', 0, 2],
        ['X', 'D', 0, 2],
        ['Y', 'Z', 2, 3]]

df = pd.DataFrame(data=data, columns=['parent', 'child', 'parent_level', 'child_level'])

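# Roots are the parents that sit at level 0
roots = df.parent[df.parent_level.eq(0)].unique()
# Build a directed graph with one edge per parent -> child row
dg = nx.from_pandas_edgelist(df, source='parent', target='child', create_using=nx.DiGraph)

# nx.descendants returns the set of all nodes reachable from each root
result = pd.DataFrame(data=[[root, nx.descendants(dg, root)] for root in roots], columns=['root', 'children'])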
print(result)
Output:

  root   children
0    A  {D, B, C}
1    X  {Z, Y, D}
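
The output above holds each root's descendants as a set, while the question asks for lists. As a minimal sketch, assuming the level columns may not always be available, the roots can instead be taken as the nodes with no incoming edge, and each set can be sorted into a list. The variable names here are illustrative and not part of the original answer.

```python
import networkx as nx
import pandas as pd

# Same edges as in the question; the level columns are left out on purpose
df = pd.DataFrame(
    [["A", "B"], ["B", "C"], ["B", "D"], ["X", "Y"], ["X", "D"], ["Y", "Z"]],
    columns=["parent", "child"],
)

dg = nx.from_pandas_edgelist(df, source="parent", target="child", create_using=nx.DiGraph)

# Assumption: a root is any node that never appears as a child (in-degree 0)
roots = sorted(node for node, degree in dg.in_degree() if degree == 0)

# Sort each descendant set so the column holds deterministic lists
result = pd.DataFrame(
    {"root": roots, "children": [sorted(nx.descendants(dg, root)) for root in roots]}
)
print(result)
```

This still iterates once per root in Python, but the traversal itself is handled by networkx.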
Recursion

You can also make find_root a generator:

def find_root(tree, child):
    if child in tree:
        for x in tree[child]:
            yield from find_root(tree, x)
    else:
        yield child
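
The excerpt does not show how tree is built. Judging from how find_root walks it, it appears to map each child to its parents, so a node that is missing from the mapping is a root. The following is a minimal sketch under that assumption, using the edges from the question; the names edges and children_by_root are hypothetical.

```python
from collections import defaultdict

# (parent, child) pairs taken from the question's DataFrame
edges = [("A", "B"), ("B", "C"), ("B", "D"), ("X", "Y"), ("X", "D"), ("Y", "Z")]

# Assumed layout: tree maps each child to the list of its parents
tree = defaultdict(list)
for parent, child in edges:
    tree[child].append(parent)
tree = dict(tree)  # plain dict, so lookups never create new keys


def find_root(tree, child):
    # Climb towards the parents until a node with no parents is reached
    if child in tree:
        for x in tree[child]:
            yield from find_root(tree, x)
    else:
        yield child


# Group every non-root node under each root it can reach
children_by_root = defaultdict(set)
for child in tree:
    for root in find_root(tree, child):
        children_by_root[root].add(child)

result = {root: sorted(kids) for root, kids in sorted(children_by_root.items())}
print(result)  # {'A': ['B', 'C', 'D'], 'X': ['D', 'Y', 'Z']}
```

Because D has two parents, find_root yields both A and X for it, which is why D appears under both roots.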
Also, if you want to avoid recursion depth issues, you can define find_root iteratively with a stack of iterators:

def find_root(tree, child):
    stack = [iter([child])]
    while stack:
        for node in stack[-1]:
            if node in tree:
                stack.append(iter(tree[node]))
            else:
                yield node
            break
        else:  # yes!  that is an `else` clause on a for loop
            stack.pop()
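
To illustrate why the stack-based version helps, here is a small sketch that runs it on a single chain several times deeper than Python's default recursion limit. The chain and the depth value are made up for the example, and tree is again assumed to map each child to its parents.

```python
import sys


def find_root(tree, child):
    # Each stack entry is an iterator over the parents still to be visited
    stack = [iter([child])]
    while stack:
        for node in stack[-1]:
            if node in tree:
                stack.append(iter(tree[node]))
            else:
                yield node  # no parents recorded, so this node is a root
            break
        else:
            # The iterator on top of the stack is exhausted
            stack.pop()


# Hypothetical chain 0 <- 1 <- 2 <- ... <- depth, where each node has one parent
depth = 5 * sys.getrecursionlimit()
tree = {i: [i - 1] for i in range(1, depth + 1)}

roots = list(find_root(tree, depth))
print(roots)  # [0]
```

The explicit stack grows on the heap rather than the call stack, so the depth of the tree is bounded by memory and not by the interpreter's recursion limit.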