SQL: exclude results from a join where a duplicate key match occurs
Tags: sql, sql-server, tsql
I work with data from multiple sources that I have no control over. The "key" values in these sources are often duplicated. I need to keep any of these duplicated values from matching in a join.
Using the following data:
T1
| ID | FirstKey | SecondKey | ThirdKey | AdditionalColumns |
+----+----------+-----------+----------+---------------------+
| 01 | Prod1 | ABC1 | 201 | Jun 2010, A, 101 |
| 02 | Prod2 | DEF2 | 202 | May 2009, A, 101 |
| 03 | Prod2 | DEF2 | 202 | May 2010, S, 101 |
| 04 | Prod3 | | 206 | Jun 2010, A, 103 |
| 05 | Prod4 | | 207 | Jun 2011, S, 103 |
T2
| ID | FirstKey | SecondKey | ThirdKey | AdditionalColumns |
+----+----------+-----------+----------+---------------------+
| 01 | Prod1 | ABC1 | 201 | Jun 2010, A, 101 |
| 02 | Prod2 | DEF2 | | May 2009, A, 101 |
| 03 | Prod2 | DEF2 | 202 | May 2010, S, 101 |
| 04 | Prod3 | | | Jun 2010, A, 103 |
| 05 | Prod4 | | 207 | Jun 2011, S, 103 |
| 06 | Prod1 | ABC1 | 201 | Jun 2010, T, 101 |
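To make the queries in this thread easier to try, the sample data can be loaded with a script along the following lines. This is only a sketch: the column types and sizes are assumptions (the question does not give the schema), and blank cells are loaded as NULL, which is consistent with the NULL that appears in the query results.

```sql
-- Assumed schema: types and lengths are guesses, not taken from the question.
CREATE TABLE T1 (
    ID                CHAR(2)     NOT NULL PRIMARY KEY,
    FirstKey          VARCHAR(10) NOT NULL,
    SecondKey         VARCHAR(10) NULL,
    ThirdKey          INT         NULL,
    AdditionalColumns VARCHAR(50) NULL
);

CREATE TABLE T2 (
    ID                CHAR(2)     NOT NULL PRIMARY KEY,
    FirstKey          VARCHAR(10) NOT NULL,
    SecondKey         VARCHAR(10) NULL,
    ThirdKey          INT         NULL,
    AdditionalColumns VARCHAR(50) NULL
);

-- Blank cells in the sample tables are inserted as NULL (assumption).
INSERT INTO T1 (ID, FirstKey, SecondKey, ThirdKey, AdditionalColumns) VALUES
    ('01', 'Prod1', 'ABC1', 201, 'Jun 2010, A, 101'),
    ('02', 'Prod2', 'DEF2', 202, 'May 2009, A, 101'),
    ('03', 'Prod2', 'DEF2', 202, 'May 2010, S, 101'),
    ('04', 'Prod3', NULL,   206, 'Jun 2010, A, 103'),
    ('05', 'Prod4', NULL,   207, 'Jun 2011, S, 103');

INSERT INTO T2 (ID, FirstKey, SecondKey, ThirdKey, AdditionalColumns) VALUES
    ('01', 'Prod1', 'ABC1', 201,  'Jun 2010, A, 101'),
    ('02', 'Prod2', 'DEF2', NULL, 'May 2009, A, 101'),
    ('03', 'Prod2', 'DEF2', 202,  'May 2010, S, 101'),
    ('04', 'Prod3', NULL,   NULL, 'Jun 2010, A, 103'),
    ('05', 'Prod4', NULL,   207,  'Jun 2011, S, 103'),
    ('06', 'Prod1', 'ABC1', 201,  'Jun 2010, T, 101');
```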
Now, if we run the query:
SELECT
    T1.FirstKey, T1.SecondKey, T1.ThirdKey,
    T2.FirstKey, T2.SecondKey, T2.ThirdKey,
    T1.AdditionalColumns, T2.AdditionalColumns
FROM
    T1 JOIN T2 ON T1.FirstKey = T2.FirstKey
        AND T1.SecondKey = T2.SecondKey
        AND T1.SecondKey IS NOT NULL
UNION
SELECT
    T1.FirstKey, T1.SecondKey, T1.ThirdKey,
    T2.FirstKey, T2.SecondKey, T2.ThirdKey,
    T1.AdditionalColumns, T2.AdditionalColumns
FROM
    T1 JOIN T2 ON T1.FirstKey = T2.FirstKey
        AND T1.ThirdKey = T2.ThirdKey
        AND T1.SecondKey IS NULL
We get the following results:
FirstKey SecondKey ThirdKey FirstKey SecondKey ThirdKey AdditionalColumns AdditionalColumns
-------- --------- -------- -------- --------- -------- ----------------- -----------------
Prod1 ABC1 201 Prod1 ABC1 201 Jun 2010, A, 101 Jun 2010, A, 101
Prod1 ABC1 201 Prod1 ABC1 201 Jun 2010, A, 101 Jun 2010, T, 101
Prod2 DEF2 202 Prod2 DEF2 202 May 2009, A, 101 May 2010, S, 101
Prod2 DEF2 202 Prod2 DEF2 202 May 2010, S, 101 May 2010, S, 101
Prod4 NULL 207 Prod4 NULL 207 Jun 2011, S, 103 Jun 2011, A, 103
I need the query to return only records that have an authoritative match, e.g. exactly one match between the two tables:
FirstKey SecondKey ThirdKey FirstKey SecondKey ThirdKey AdditionalColumns AdditionalColumns
-------- --------- -------- -------- --------- -------- ----------------- -----------------
Prod4 NULL 207 Prod4 NULL 207 Jun 2011, S, 103 Jun 2011, A, 103
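Is there a way to do this within the join?
Currently I get there by generating a CTE for each table that enforces uniqueness, which guarantees that the keys used in the join are unique. This works, but it is ugly and adds a lot of work to the query.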
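The question does not show those CTEs, so the following is only a guess at what such an approach might look like. The CTE names `T1Unique` and `T2Unique` are hypothetical, and the sketch assumes that uniqueness is judged on the combination of all three key columns.

```sql
-- Hypothetical sketch of the "one CTE per table" approach; not the asker's actual code.
WITH T1Unique AS (
    -- Keep only T1 rows whose key combination occurs exactly once.
    SELECT FirstKey, SecondKey, ThirdKey,
           MIN(AdditionalColumns) AS AdditionalColumns  -- single row per group, so MIN just returns it
    FROM T1
    GROUP BY FirstKey, SecondKey, ThirdKey
    HAVING COUNT(*) = 1
),
T2Unique AS (
    -- Same filter applied to T2.
    SELECT FirstKey, SecondKey, ThirdKey,
           MIN(AdditionalColumns) AS AdditionalColumns
    FROM T2
    GROUP BY FirstKey, SecondKey, ThirdKey
    HAVING COUNT(*) = 1
)
SELECT
    a.FirstKey  AS T1FirstKey,  a.SecondKey AS T1SecondKey, a.ThirdKey AS T1ThirdKey,
    b.FirstKey  AS T2FirstKey,  b.SecondKey AS T2SecondKey, b.ThirdKey AS T2ThirdKey,
    a.AdditionalColumns AS T1AdditionalColumns,
    b.AdditionalColumns AS T2AdditionalColumns
FROM T1Unique AS a
JOIN T2Unique AS b
  ON a.FirstKey = b.FirstKey
 AND (   (a.SecondKey IS NOT NULL AND a.SecondKey = b.SecondKey)  -- match on SecondKey when present
      OR (a.SecondKey IS NULL     AND a.ThirdKey  = b.ThirdKey)); -- otherwise fall back to ThirdKey
```

Every table taking part in the join needs its own deduplicating CTE, which is where the extra work comes from.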
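Is there another way to perform this join so that duplicate matches are excluded? This assumes that I cannot programmatically exclude any of the duplicate rows based on the AdditionalColumns data.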
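I run into this problem again and again, and the CTE approach feels clumsy for what must surely be an already-solved problem.
How about using GROUP BY in the query: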
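-- The derived table needs an alias (M) and unique column names in T-SQL.
-- Grouping is on the key columns only: grouping on AdditionalColumns as well
-- would put every row in its own group, so COUNT(*) would always be 1.
SELECT M.T1FirstKey, M.T1SecondKey, M.T1ThirdKey,
       M.T2FirstKey, M.T2SecondKey, M.T2ThirdKey,
       MIN(M.T1AdditionalColumns) AS T1AdditionalColumns,  -- one row per group, so MIN returns it
       MIN(M.T2AdditionalColumns) AS T2AdditionalColumns,
       COUNT(*) AS MatchCount
FROM (
    SELECT
        T1.FirstKey AS T1FirstKey, T1.SecondKey AS T1SecondKey, T1.ThirdKey AS T1ThirdKey,
        T2.FirstKey AS T2FirstKey, T2.SecondKey AS T2SecondKey, T2.ThirdKey AS T2ThirdKey,
        T1.AdditionalColumns AS T1AdditionalColumns, T2.AdditionalColumns AS T2AdditionalColumns
    FROM
        T1 JOIN T2 ON T1.FirstKey = T2.FirstKey
            AND T1.SecondKey = T2.SecondKey
            AND T1.SecondKey IS NOT NULL
    UNION
    -- Column names of a UNION come from the first SELECT.
    SELECT
        T1.FirstKey, T1.SecondKey, T1.ThirdKey,
        T2.FirstKey, T2.SecondKey, T2.ThirdKey,
        T1.AdditionalColumns, T2.AdditionalColumns
    FROM
        T1 JOIN T2 ON T1.FirstKey = T2.FirstKey
            AND T1.ThirdKey = T2.ThirdKey
            AND T1.SecondKey IS NULL
) AS M
GROUP BY M.T1FirstKey, M.T1SecondKey, M.T1ThirdKey,
         M.T2FirstKey, M.T2SecondKey, M.T2ThirdKey
HAVING COUNT(*) = 1;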
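A different technique, not the one used in the answer above, is to count matches per source row with a window function and keep only pairs where each side matched exactly once. The sketch below assumes that `ID` uniquely identifies a row in each table, as it does in the sample data. The names `Matches`, `Counted`, `MatchesForT1Row` and `MatchesForT2Row` are made up for illustration.

```sql
-- Sketch only: assumes T1.ID and T2.ID are unique within their tables.
WITH Matches AS (
    -- Same matching rule as the UNION query, written as a single join.
    SELECT
        T1.ID AS T1ID, T2.ID AS T2ID,
        T1.FirstKey AS T1FirstKey, T1.SecondKey AS T1SecondKey, T1.ThirdKey AS T1ThirdKey,
        T2.FirstKey AS T2FirstKey, T2.SecondKey AS T2SecondKey, T2.ThirdKey AS T2ThirdKey,
        T1.AdditionalColumns AS T1AdditionalColumns,
        T2.AdditionalColumns AS T2AdditionalColumns
    FROM T1
    JOIN T2
      ON T1.FirstKey = T2.FirstKey
     AND (   (T1.SecondKey IS NOT NULL AND T1.SecondKey = T2.SecondKey)
          OR (T1.SecondKey IS NULL     AND T1.ThirdKey  = T2.ThirdKey))
),
Counted AS (
    -- Window functions cannot be used in WHERE, so the counts are computed here first.
    SELECT M.*,
           COUNT(*) OVER (PARTITION BY M.T1ID) AS MatchesForT1Row,
           COUNT(*) OVER (PARTITION BY M.T2ID) AS MatchesForT2Row
    FROM Matches AS M
)
SELECT T1FirstKey, T1SecondKey, T1ThirdKey,
       T2FirstKey, T2SecondKey, T2ThirdKey,
       T1AdditionalColumns, T2AdditionalColumns
FROM Counted
WHERE MatchesForT1Row = 1   -- this T1 row matched exactly one T2 row
  AND MatchesForT2Row = 1;  -- and that T2 row matched exactly one T1 row
```

Counting per row rather than per key group also excludes a T1 row that matches several T2 rows with different key values. On the sample data, this should leave only the Prod4 pair.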
A suggestion:
Turn your entire SELECT into a subquery. Let's call it SUBQ.
Then you do this:
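-- SUBQ is a placeholder for the whole UNION query above.
-- T-SQL needs an alias on the derived table, does not use backticks,
-- and only allows grouped columns or aggregates in the SELECT list.
SELECT ThirdKey
FROM (SUBQ) AS S
GROUP BY ThirdKey
HAVING COUNT(*) = 1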