Hadoop Hive: flattening a hierarchy table into multiple levels
I have a hierarchy table whose parent-child relationships go at most 15 levels deep. I need to find every level of child nodes for each parent node. I tried a recursive query, but it does not work in Hive or Impala. Please suggest a way to solve this.
Tags: hadoop, hive, hiveql, impala
Input data:
+---------------+----------------+
|parent_node_id | child_node_id |
+---------------+----------------+
| C1 | C11 |
+---------------+----------------+
| C11 | C12 |
+---------------+----------------+
| C12 | 123 |
+---------------+----------------+
| C12 | 456 |
+---------------+----------------+
| P1 | C1 |
+---------------+----------------+
| P1 | C2 |
+---------------+----------------+
| P2 | C3 |
+---------------+----------------+
+-------+----------+
| node | balance |
+-------+----------+
| 123 | 100 |
+-------+----------+
| 456 | 150 |
+-------+----------+
| C2 | 400 |
+-------+----------+
| C3 | 200 |
+-------+----------+
Output data:
+---------------+-------------+---------+----------+----------+
|Parent_node_id | Level 1 | Level 2 | Level 3 | Level 4 |
+---------------+-------------+---------+----------+----------+
| P1 | C1 | C11 | C12 | 123 |
+---------------+-------------+---------+----------+----------+
| P1 | C1 | C11 | C12 | 456 |
+---------------+-------------+---------+----------+----------+
| P1 | C2 | NULL | NULL | NULL |
+---------------+-------------+---------+----------+----------+
| P2 | C3 | NULL | NULL | NULL |
+---------------+-------------+---------+----------+----------+
+-------+----------+
| node | balance |
+-------+----------+
| 123 | 100 |
+-------+----------+
| 456 | 150 |
+-------+----------+
| C12 | 250 |
+-------+----------+
| C11 | 250 |
+-------+----------+
| C1 | 250 |
+-------+----------+
| P1 | 650 |
+-------+----------+
| C2 | 400 |
+-------+----------+
| P2 | 200 |
+-------+----------+
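The second expected table reads as a roll-up: each node's balance is its own balance plus the balances of everything beneath it. Below is a minimal Python sketch of that reading, run outside Hive purely to check the arithmetic. It assumes the hierarchy has no cycles, and the names `EDGES`, `BALANCES`, and `rollup` are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict

# Parent/child edges and balances copied from the input tables above.
EDGES = [
    ("C1", "C11"), ("C11", "C12"), ("C12", "123"), ("C12", "456"),
    ("P1", "C1"), ("P1", "C2"), ("P2", "C3"),
]
BALANCES = {"123": 100, "456": 150, "C2": 400, "C3": 200}


def rollup(edges, balances):
    """Return {node: own balance + balances of all descendants}.

    Assumes the edges form a tree or forest (no cycles).
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    totals = {}

    def total(node):
        # Memoised depth-first sum over the subtree rooted at `node`.
        if node not in totals:
            own = balances.get(node, 0)
            totals[node] = own + sum(total(c) for c in children.get(node, []))
        return totals[node]

    all_nodes = set(balances) | {n for edge in edges for n in edge}
    for node in all_nodes:
        total(node)
    return totals


if __name__ == "__main__":
    for node, value in sorted(rollup(EDGES, BALANCES).items()):
        print(node, value)
```

With the input above this gives 650 for P1 and 200 for P2, matching the expected table. In Hive the same result would need the flattened hierarchy joined to the balance table and grouped per ancestor, which this sketch does not attempt.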
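Since you know the maximum depth the hierarchy can reach, you can chain a fixed number of self-joins. Here is the sample data: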
hive> select * from hier;
OK
parent child child_desc
C1 C11 Party
C1 C111 Placement
C111 C1111 XM
C12 C121 Brand
C2 C12 Equity
XN12 C1 Private
XN12 C2 Public
Time taken: 0.123 seconds, Fetched: 7 row(s)
Query:
-- lvl1: pair every row with the direct children of its child
with lvl1 as
(
select p1.parent, p1.child child1, p1.child_desc child_desc1, p2.child child2, p2.child_desc child_desc2
from hier p1 left outer join hier p2
on p1.child = p2.parent
)
-- lvl2: extend each path by one more level
-- nvl maps a NULL key to the placeholder 'A'; this assumes no node is actually named 'A'
, lvl2 as
(
select lvl1.parent, lvl1.child1, lvl1.child_desc1, lvl1.child2 child2, lvl1.child_desc2, p3.parent p2,
p3.child as child3, p3.child_desc child_desc3
from lvl1 left outer join hier p3 on nvl(lvl1.child2,'A') = nvl(p3.parent,'A')
)
-- lvl3: extend once more; repeat this block for deeper hierarchies
, lvl3 as
(
select lvl2.parent, lvl2.child1 child1, lvl2.child_desc1, lvl2.child2 child2, lvl2.child_desc2,
lvl2.child3 child3, lvl2.child_desc3, p3.child as child4, p3.child_desc as child_desc4
from lvl2 left outer join hier p3 on nvl(lvl2.child3,'A') = nvl(p3.parent,'A')
)
select * from lvl3;
C1 C11 Party NULL NULL NULL NULL NULL NULL
C1 C111 Placement C1111 XM NULL NULL NULL NULL
C111 C1111 XM NULL NULL NULL NULL NULL NULL
C12 C121 Brand NULL NULL NULL NULL NULL NULL
C2 C12 Equity C121 Brand NULL NULL NULL NULL
XN12 C1 Private C11 Party NULL NULL NULL NULL
XN12 C1 Private C111 Placement C1111 XM NULL NULL
XN12 C2 Public C12 Equity C121 Brand NULL NULL
The query above stops at the lvl3 step; you can repeat the same pattern up to level 15.
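Repeating that pattern by hand 15 times is error-prone, so one option is to generate the query text with a small script. The sketch below is only an illustration: the function name `build_flatten_query` is hypothetical, it assumes a table with `parent` and `child` columns like `hier`, it leaves out the `child_desc` columns, and it emits one flat chain of left outer joins rather than nested CTEs.

```python
def build_flatten_query(table: str, levels: int) -> str:
    """Build a HiveQL query that flattens `table` into `levels` child columns.

    Assumes `table` has columns named `parent` and `child`.
    """
    if levels < 1:
        raise ValueError("levels must be >= 1")

    select_cols = ["h1.parent AS parent", "h1.child AS child1"]
    joins = []
    for i in range(2, levels + 1):
        # Each extra level joins the previous level's child to the next parent.
        select_cols.append(f"h{i}.child AS child{i}")
        joins.append(
            f"LEFT OUTER JOIN {table} h{i} ON h{i - 1}.child = h{i}.parent"
        )

    lines = ["SELECT " + ",\n       ".join(select_cols), f"FROM {table} h1"]
    lines.extend(joins)
    return "\n".join(lines)


if __name__ == "__main__":
    # Print a 4-level query; pass 15 for the full depth in the question.
    print(build_flatten_query("hier", 4))
```

The generated text can be saved to a file and run with the Hive CLI. When a path ends early, the remaining joins find no match and the later columns come back as NULL, which is the same behaviour as the hand-written version.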
I tested the lvl3 query above against the following data:
hive> select * from hier;
OK
parent child
C1 C11
C11 C12
C12 123
P1 C1
P2 C2
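Output:
C1 C11 C12 123 NULL
C11 C12 123 NULL NULL
C12 123 NULL NULL NULL
P1 C1 C11 C12 123
P2 C2 NULL NULL NULL
If you need the query to handle n levels without hard-coding it, you will have to write a script that either generates the query or runs the same join repeatedly, which is a fair amount of work.
Hope this helps.
Comments:
For this problem, I think the output should contain only two rows, running from the top-level parent down to the final child: P1 C1 C11 C12 123 and P2 C2 NULL NULL NULL.
I have updated the question because I also need to compute the balance based on the hierarchy levels we generate. Please help me update the query to get this result. Thanks in advance.
Please help me achieve this result. I really need support on this. Looking forward to your reply.
Please help solve this problem. Thanks.
I have created a new question for this problem statement. Please help solve it.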