Unique pairs of columns in SQL
I have 4 columns, as shown below:
COL1 COL1_TIME COL2 COL2_TIME
A 09:20:00 E 09:35:00
A 09:20:00 F 09:36:00
A 09:20:00 G 09:40:00
A 09:20:00 H 09:59:00
B 09:25:00 E 09:35:00
B 09:25:00 F 09:36:00
B 09:25:00 G 09:40:00
B 09:25:00 H 09:59:00
C 09:30:00 E 09:35:00
C 09:30:00 F 09:36:00
C 09:30:00 G 09:40:00
C 09:30:00 H 09:59:00
D 09:50:00 H 09:59:00
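I need to select unique pairs of values from COL1 and COL2. To find a pair, pick the COL2_TIME that is closest to COL1_TIME.
So the closest time for A is E; for B it is F, because E has already been taken; and so on.
The result should look like this: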
A E
B F
C G
D H
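Any ideas?

If the cardinality of the distinct values of COL1 and COL2 is always 1-to-1, and there are no other special cases or exceptions, you can do the following: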
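-- my_table is a placeholder name; replace it with the real table.
with temp1 as (
    -- number the distinct col1 values in time order
    select distinct col1
          ,col1_time
          ,dense_rank() over (order by col1_time, col1) as rownum1
    from my_table
), temp2 as (
    -- number the distinct col2 values in time order
    select distinct col2
          ,col2_time
          ,dense_rank() over (order by col2_time, col2) as rownum2
    from my_table
)
-- pair the n-th col1 value with the n-th col2 value
select temp1.col1
      ,temp2.col2
from temp1
join temp2 on temp1.rownum1 = temp2.rownum2
order by temp1.col1;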
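To make the idea behind this query easier to check outside the database, here is a minimal Python sketch of the same rank-pairing logic. The names `rank_pairs`, `T1`, `T2`, and `ROWS` are made up for illustration; the data is the 13 sample rows from the question. It assumes, like the query, that the distinct values on both sides line up 1-to-1 when ordered by time.

```python
# Times for each distinct value, taken from the question's table.
T1 = {"A": "09:20:00", "B": "09:25:00", "C": "09:30:00", "D": "09:50:00"}
T2 = {"E": "09:35:00", "F": "09:36:00", "G": "09:40:00", "H": "09:59:00"}

# The same 13 rows as the question: A, B, C each combine with E..H; D only with H.
ROWS = [(c1, T1[c1], c2, T2[c2]) for c1 in "ABC" for c2 in "EFGH"]
ROWS.append(("D", T1["D"], "H", T2["H"]))


def rank_pairs(rows):
    """Pair the n-th distinct col1 with the n-th distinct col2, both in time order."""
    # Zero-padded HH:MM:SS strings sort correctly as plain text.
    left = sorted({(t1, c1) for c1, t1, _c2, _t2 in rows})
    right = sorted({(t2, c2) for _c1, _t1, c2, t2 in rows})
    # zip stops at the shorter side, like the inner join on the rank.
    return [(c1, c2) for (_t1, c1), (_t2, c2) in zip(left, right)]


for col1, col2 in rank_pairs(ROWS):
    print(col1, col2)
```

Note that this pairing never looks at which (COL1, COL2) combinations actually occur as rows, so it only agrees with the "closest unused time" rule when the 1-to-1 assumption holds.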
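Well, without recursive common table expressions, you need to hard-wire something. If COL1 has more than 4 values, it gets more tedious; if this is a really important business problem, consider writing a UDX. Otherwise, here is an approach that works. The input is included as the first common table expression of the WITH clause: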
WITH
input(col1,col1_time,col2,col2_time) AS (
          SELECT 'A',TIME '09:20:00','E',TIME '09:35:00'
UNION ALL SELECT 'A',TIME '09:20:00','F',TIME '09:36:00'
UNION ALL SELECT 'A',TIME '09:20:00','G',TIME '09:40:00'
UNION ALL SELECT 'A',TIME '09:20:00','H',TIME '09:59:00'
UNION ALL SELECT 'B',TIME '09:25:00','E',TIME '09:35:00'
UNION ALL SELECT 'B',TIME '09:25:00','F',TIME '09:36:00'
UNION ALL SELECT 'B',TIME '09:25:00','G',TIME '09:40:00'
UNION ALL SELECT 'B',TIME '09:25:00','H',TIME '09:59:00'
UNION ALL SELECT 'C',TIME '09:30:00','E',TIME '09:35:00'
UNION ALL SELECT 'C',TIME '09:30:00','F',TIME '09:36:00'
UNION ALL SELECT 'C',TIME '09:30:00','G',TIME '09:40:00'
UNION ALL SELECT 'C',TIME '09:30:00','H',TIME '09:59:00'
UNION ALL SELECT 'D',TIME '09:50:00','H',TIME '09:59:00'
)
,
col1_A AS (
  SELECT
    col1
  , col2
  FROM input
  WHERE col1='A'
  ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
  LIMIT 1
)
,
col1_B AS (
  SELECT
    col1
  , col2
  FROM input
  WHERE col1='B'
    AND col2 NOT IN (
      SELECT col2 FROM col1_A
    )
  ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
  LIMIT 1
)
,
col1_C AS (
  SELECT
    col1
  , col2
  FROM input
  WHERE col1='C'
    AND col2 NOT IN (
      SELECT col2 FROM col1_A
      UNION ALL SELECT col2 FROM col1_B
    )
  ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
  LIMIT 1
)
,
col1_D AS (
  SELECT
    col1
  , col2
  FROM input
  WHERE col1='D'
    AND col2 NOT IN (
      SELECT col2 FROM col1_A
      UNION ALL SELECT col2 FROM col1_B
      UNION ALL SELECT col2 FROM col1_C
    )
  ORDER BY ABS(TIMESTAMPDIFF('SECOND',col1_time::TIMESTAMP,col2_time::TIMESTAMP))
  LIMIT 1
)
          SELECT * FROM col1_A
UNION ALL SELECT * FROM col1_B
UNION ALL SELECT * FROM col1_C
UNION ALL SELECT * FROM col1_D
;
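I wouldn't be surprised if this is not what you were hoping for.
Have fun,
Marco the Sane

I'm pretty sure this needs a recursive CTE, and I don't think Vertica supports those.

@GordonLinoff: Could you write the query for SQL Server?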
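For reference, the "closest time, skipping values already taken" rule from the question is a greedy assignment, and it is easy to express procedurally. The following is only a minimal Python sketch of that logic, the kind of thing a UDX or client-side code would have to implement; it is not Vertica or SQL Server code. The names `greedy_pairs`, `to_seconds`, `T1`, `T2`, and `ROWS` are hypothetical, and the sketch assumes COL1 values are processed in order of COL1_TIME, with ties on the time gap broken by the COL2 name.

```python
# Times for each distinct value, taken from the question's table.
T1 = {"A": "09:20:00", "B": "09:25:00", "C": "09:30:00", "D": "09:50:00"}
T2 = {"E": "09:35:00", "F": "09:36:00", "G": "09:40:00", "H": "09:59:00"}

# The same 13 rows as the question: A, B, C each combine with E..H; D only with H.
ROWS = [(c1, T1[c1], c2, T2[c2]) for c1 in "ABC" for c2 in "EFGH"]
ROWS.append(("D", T1["D"], "H", T2["H"]))


def to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds since midnight."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def greedy_pairs(rows):
    """Give each col1, in col1_time order, the closest col2 not yet taken."""
    # Collect (gap_in_seconds, col2) candidates per (col1_time, col1) key.
    candidates = {}
    for col1, t1, col2, t2 in rows:
        gap = abs(to_seconds(t2) - to_seconds(t1))
        candidates.setdefault((t1, col1), []).append((gap, col2))

    used = set()   # col2 values already assigned
    result = []
    for t1, col1 in sorted(candidates):            # assumed processing order
        for _gap, col2 in sorted(candidates[(t1, col1)]):
            if col2 not in used:
                used.add(col2)
                result.append((col1, col2))
                break                              # one partner per col1
    return result


for col1, col2 in greedy_pairs(ROWS):
    print(col1, col2)
```

On the sample data this prints the four pairs A E, B F, C G, D H. A COL1 value whose candidates are all taken is simply left out of the result; whether that is the desired behaviour is not specified in the question.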