Hadoop: joining a table twice in Hive
I don't know how to implement this in Hive. Please point me in the right direction. Suppose I have Hive tables like these:
Table1:
id | primary | secondary
-------------------------
1 | A | [B,C]
2 | B | [A]
3 | C | [A,B]
Table2
id | primary | Heart | Liver | Kidney
-------------------------------------
1 | A | 1.3 | 0 | 0.9
2 | B | 0 | 3 | 2.3
3 | C | 0 | 9 | 4.3
Table3
id | medicine | primary
-----------------------
1 | M1 | A
2 | M2 | A
3 | M3 | B
4 | M4 | B
5 | M1 | B
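Now, from these tables: if a primary value from Table1 (e.g. A) and one of its secondary values (e.g. B) both exist in Table2, and at least one tissue (here, Kidney) has a value greater than zero for both, then I have to print the following (sample output)
Can someone help me write this query in Hive?
Thanks in advance.
Convert Table2 into the following format: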
Table2
id | primary | Tissues
-------------------------------------
1 | A | Heart, Kidney
2 | B | Liver, Kidney
3 | C | Liver, Kidney
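One way this Table2 transformation might be written in HiveQL is sketched below. It assumes the table is named table2 and that heart, liver and kidney are numeric columns; none of these details are confirmed by the post. The column name primary is backtick-quoted because PRIMARY is a reserved word in newer Hive versions. The sketch relies on concat_ws skipping NULL arguments, so a tissue whose value is not greater than zero simply drops out of the list.

```sql
-- Sketch only: table and column names are assumed from the sample tables.
SELECT
  id,
  `primary`,
  concat_ws(', ',
    CASE WHEN heart  > 0 THEN 'Heart'  END,   -- NULL when heart is 0
    CASE WHEN liver  > 0 THEN 'Liver'  END,   -- NULL when liver is 0
    CASE WHEN kidney > 0 THEN 'Kidney' END    -- NULL when kidney is 0
  ) AS tissues
FROM table2;
```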
Convert Table3 into the following format:
Table3
id | primary | drug_list
-------------------------
1 | A | M1,M2
2 | B | M1,M3,M4
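The Table3 transformation could be sketched with collect_set, which gathers the distinct medicines per primary value. The table name table3 is an assumption, and sort_array is added only to make the order of the list deterministic. The id column of the target format is left out here, since the post does not say how it is generated.

```sql
-- Sketch only: one row per primary value with a comma-separated drug list.
SELECT
  `primary`,
  concat_ws(',', sort_array(collect_set(medicine))) AS drug_list
FROM table3
GROUP BY `primary`;
```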
From Table1, I created two tables.
Join the two tables above and find the result based on the common tissue values.
I found the solution. Thanks.
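The answer does not show the final query, so the following is only one possible sketch of the join on common tissues, not the poster's actual solution. It assumes that secondary in table1 is an array<string>, that the tables are named table1 and table2, and it uses a long-format helper (the hypothetical tissue_long, one row per primary value and tissue) instead of the comma-separated string, because a row-per-tissue layout is easier to join on. That helper is joined twice, once for the primary value and once for the secondary value.

```sql
-- Sketch only: tissue_long and pairs are hypothetical helper names.
WITH tissue_long AS (
  -- one row per (primary, tissue) where the tissue value is greater than zero
  SELECT `primary`, 'Heart'  AS tissue FROM table2 WHERE heart  > 0
  UNION ALL
  SELECT `primary`, 'Liver'  AS tissue FROM table2 WHERE liver  > 0
  UNION ALL
  SELECT `primary`, 'Kidney' AS tissue FROM table2 WHERE kidney > 0
),
pairs AS (
  -- one row per (primary, secondary element) from table1
  SELECT t1.`primary` AS primary_val, s.secondary_val
  FROM table1 t1
  LATERAL VIEW explode(t1.secondary) s AS secondary_val
)
SELECT
  p.primary_val,
  p.secondary_val,
  concat_ws(', ', collect_set(tp.tissue)) AS common_tissues
FROM pairs p
JOIN tissue_long tp                        -- tissues of the primary value
  ON tp.`primary` = p.primary_val
JOIN tissue_long ts                        -- same helper, joined a second time
  ON ts.`primary` = p.secondary_val
 AND ts.tissue = tp.tissue                 -- keep only tissues shared by both
GROUP BY p.primary_val, p.secondary_val;
```

With the sample data, the pair (A, B) would share only Kidney, which agrees with the example given in the question. The drug list derived from Table3 could then be attached with a further join on the primary value if the output needs it.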