Google BigQuery: How to display a multi-level tree structure in BigQuery

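I am working on a tree hierarchy of supervisors and the employees they supervise. The difficulty is that some supervisors are themselves employees supervised by other supervisors, and there are many of them.

The SQL queries I learned in class only involve a simple self-join, which covers just two levels: A is supervised by B, and that's it.

But the real-world problem is much more complicated. There are multiple levels, and I am not sure of the exact number. For example, A is supervised by B, B is supervised by C, C is supervised by D, and so on.

I assume there are 5 or more levels of supervision. The raw data may look like this: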

     Employee     Supervisor
        A             B
        C             B
        D             B
        B             V
        E             V
        F             E
        G             V
        V          (Blank which indicates no boss)
        H             A
The code provided by a BigQuery expert is as follows:

#standardSQL
SELECT t.Supervisor,
  IF(t.Supervisor = t5.Supervisor, 
    STRUCT(Employee2 AS Employee1, NULL AS Employee2),
    STRUCT(t5.Supervisor AS Employee1, Employee2 AS Employee2)
  ).*
FROM (
  SELECT t1.Employee Supervisor,
    COALESCE(t4.Employee, t3.Employee, t2.Employee) Employee2
  FROM `project.dataset.table` t1
  LEFT JOIN `project.dataset.table` t2 ON t2.Supervisor = t1.Employee
  LEFT JOIN `project.dataset.table` t3 ON t3.Supervisor = t2.Employee
  LEFT JOIN `project.dataset.table` t4 ON t4.Supervisor = t3.Employee
  WHERE t1.Supervisor IS NULL
) t
LEFT JOIN `project.dataset.table` t5 ON t5.Employee = t.Employee2
The result looks like this:

Row Supervisor  Employee1   Employee2    
1   V           B           A    
2   V           B           C    
3   V           B           D    
4   V           E           F    
5   V           G           null  
But what we want is:

Row Supervisor  Employee1   Employee2   Employee3
1   V           B           A           H
2   V           B           C           null
3   V           B           D           null
4   V           E           F           null
5   V           G           null        null

If I want more levels in the hierarchy, how should I change the code? That is, if I want to add Employee3 or Employee4, how do I edit it? Thanks.

The following is for BigQuery Standard SQL:

#standardSQL
WITH e0 AS (
  SELECT Employee AS Supervisor FROM `project.dataset.table` WHERE Supervisor IS NULL
), e1 AS (
  SELECT e.Supervisor, Employee AS Employee1 
  FROM e0 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Supervisor
), e2 AS (
  SELECT e.Supervisor, Employee1, Employee AS Employee2
  FROM e1 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee1
), e3 AS (
  SELECT e.Supervisor, Employee1, Employee2, Employee AS Employee3
  FROM e2 e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee2
)
SELECT * FROM e3   
Applied to the sample data in the question, the result is:

Row Supervisor  Employee1   Employee2   Employee3    
1   V           B           A           H    
2   V           B           C           null     
3   V           B           D           null     
4   V           E           F           null     
5   V           G           null        null       
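
As a rough cross-check of the output above, the chained LEFT JOIN logic can be mimicked in plain Python. This is only an illustrative sketch and not part of the original answer: `ROWS` and `expand_levels` are hypothetical names, and the row order simply follows the input order, which BigQuery itself does not guarantee.

```python
from collections import defaultdict

# Sample (Employee, Supervisor) rows from the question; None means "no boss".
ROWS = [
    ("A", "B"), ("C", "B"), ("D", "B"), ("B", "V"), ("E", "V"),
    ("F", "E"), ("G", "V"), ("V", None), ("H", "A"),
]


def expand_levels(rows, levels):
    """Mimic the chained LEFT JOINs: one tuple per path, padded with None."""
    children = defaultdict(list)
    for employee, supervisor in rows:
        children[supervisor].append(employee)

    # Level 0: employees with no supervisor (the e0 step).
    paths = [[root] for root in children.get(None, [])]
    for _ in range(levels):
        next_paths = []
        for path in paths:
            last = path[-1]
            # A NULL never matches a join key and a leaf has no children;
            # in both cases the LEFT JOIN keeps the row and appends a NULL.
            kids = children.get(last, []) if last is not None else []
            if kids:
                next_paths.extend(path + [kid] for kid in kids)
            else:
                next_paths.append(path + [None])
        paths = next_paths
    return [tuple(path) for path in paths]


result = expand_levels(ROWS, 3)
for row in result:
    print(row)
```

Each tuple corresponds to one output row (Supervisor, Employee1, Employee2, Employee3), so changing the `levels` argument shows how adding another CTE would widen the result.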
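You can easily extend the query above to add more levels, as shown below (replace <N> and <N-1> with the respective numbers, such as 4, 5, 6, 7, and so on), obviously up to a reasonable extent.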

e<N> AS (
  SELECT e.Supervisor, Employee1, Employee2, Employee3, ... , Employee AS Employee<N>
  FROM e<N-1> e LEFT JOIN `project.dataset.table` t ON t.Supervisor = e.Employee<N-1>
)   
SELECT * FROM e<N>     
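
Since every level follows the same pattern, the query text can also be generated rather than typed by hand. Below is a minimal sketch in Python, assuming the same column names (`Employee`, `Supervisor`) and the placeholder table name used above. `build_hierarchy_query` is a hypothetical helper, not a BigQuery API, and the table name is interpolated as-is, so it should only be given trusted input.

```python
def build_hierarchy_query(levels, table="project.dataset.table"):
    """Return BigQuery Standard SQL with one CTE per level (e0 .. e<levels>)."""
    if levels < 1:
        raise ValueError("levels must be at least 1")

    # e0: the root(s) of the tree, i.e. employees with no supervisor.
    ctes = [
        "e0 AS (\n"
        f"  SELECT Employee AS Supervisor FROM `{table}` WHERE Supervisor IS NULL\n"
        ")"
    ]
    for n in range(1, levels + 1):
        # Columns carried over from the previous level: Employee1 .. Employee<n-1>.
        carried = "".join(f"Employee{i}, " for i in range(1, n))
        # Level 1 joins on the root column; later levels join on the previous level.
        join_key = "e.Supervisor" if n == 1 else f"e.Employee{n - 1}"
        ctes.append(
            f"e{n} AS (\n"
            f"  SELECT e.Supervisor, {carried}Employee AS Employee{n}\n"
            f"  FROM e{n - 1} e LEFT JOIN `{table}` t ON t.Supervisor = {join_key}\n"
            ")"
        )
    return "#standardSQL\nWITH " + ", ".join(ctes) + f"\nSELECT * FROM e{levels}"


# Example: generate the query text for four levels of employees.
query = build_hierarchy_query(4)
print(query)
```

The generated text can be pasted into the BigQuery console or passed to whichever client is already in use. The depth is still fixed when the query is built, so the number of levels has to be chosen up front, just as with the hand-written version.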

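It works in theory. But in my actual table, I found that some employees are not assigned any supervisor, which makes the analysis very complicated.

That most likely means your initial data is incomplete, or that there are other factors to consider. You may want to revisit your case and post a new question. Meanwhile, I think your current question has been fully answered; consider accepting the answer (and voting it up, if you have not already) :o)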