How to remove duplicates from a SQL table using a condition
Tags: sql, apache-spark-sql

I have the SQL table below, and I want to remove duplicate entries when the object_id matches the team_name. Basically, I want unique values per object. I want to convert the table above into the one shown below:
session_id  object_id  team_name  user_name  user_desc
----------  ---------  ---------  ---------  -----------------
session1    user1      team1      user1      user1_description
session1    user2      team1      user2      user2_description
session1    team1      team1      null       null
How can I achieve this?

If I understand correctly, you can use aggregation:
select (case when min(session_id) = max(session_id) then min(session_id) end) as session_id,
       object_id,
       (case when min(team_name) = max(team_name) then min(team_name) end) as team_name,
       (case when min(user_name) = max(user_name) then min(user_name) end) as user_name,
       (case when min(user_desc) = max(user_desc) then min(user_desc) end) as user_desc
from t
group by object_id;
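To see how the aggregation behaves, here is a minimal sketch that runs the same query against an in-memory SQLite database through Python's `sqlite3` module. SQLite is only a stand-in for Spark SQL here. The sample rows and the `dedupe` helper are hypothetical (the original input table is not shown in the question), and the `order by` is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical input rows: the question's original table is not shown,
# so these duplicates are invented for illustration.
rows = [
    ("session1", "user1", "team1", "user1", "user1_description"),
    ("session1", "user1", "team1", "user1", "user1_description"),  # duplicate
    ("session1", "user2", "team1", "user2", "user2_description"),
    ("session1", "team1", "team1", None, None),
    ("session1", "team1", "team1", None, None),  # duplicate
]

# Same query as in the answer, plus an ORDER BY for a stable row order.
QUERY = """
select (case when min(session_id) = max(session_id) then min(session_id) end) as session_id,
       object_id,
       (case when min(team_name) = max(team_name) then min(team_name) end) as team_name,
       (case when min(user_name) = max(user_name) then min(user_name) end) as user_name,
       (case when min(user_desc) = max(user_desc) then min(user_desc) end) as user_desc
from t
group by object_id
order by object_id
"""


def dedupe(input_rows):
    """Load rows into a temporary table t and return one row per object_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table t (session_id text, object_id text, team_name text, "
        "user_name text, user_desc text)"
    )
    conn.executemany("insert into t values (?, ?, ?, ?, ?)", input_rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = dedupe(rows)
for row in result:
    print(row)
```

Each `case` expression keeps a column's value only when every row in the group agrees on it. If the rows for one object_id disagree, or if all of them are null, that column comes back as null, so the query yields one row per object_id and does not pick an arbitrary value from conflicting duplicates.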
What is the team id here? It is not mentioned in the schema.