SQL: Summarizing date ranges from a list of dates
Tags: sql, sql-server, tsql
I have a table with an id column plus validity start and end date columns. Each id has multiple validity date ranges. I want to reduce the records as far as possible, producing one row for each group of contiguous dates.
declare @tbl table (cid int, st_date int, end_date int )
insert into @tbl (cid, st_date,end_date)
values (1,20190110,20190111),
(1,20190111,20190117),
(1,20190117,20190123),
(2,20190101,20190117),
(2,20190119,20190123),
(2,20190123,20190127)
Desired output:
cid st_date end_date
1 20190110 20190123
2 20190101 20190117
2 20190119 20190127
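For each cid, take the earliest start date and the latest end date. Note that this returns exactly one row per cid, so for cid 2 it yields 20190101 to 20190127 and hides the gap between 20190117 and 20190119: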
SELECT cid, MIN(st_date) as st_date, MAX(end_date) as end_date
FROM @tbl
GROUP BY cid
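I could not find any option other than using a cursor. You can check the new output listed below against the newly supplied data; the output looks correct to me.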
DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT )
DECLARE @tmp TABLE (cid INT, st_date INT, end_date INT, gid INT )
DECLARE @gid INT
SET @gid = 1
INSERT INTO @tbl (cid, st_date,end_date)
VALUES
(1,20190110,20190111),
(1,20190111,20190117),
(1,20190117,20190123),
(2,20190101,20190117),
(2,20190119,20190123),
(2,20190123,20190127),
(2,20190201,20190205),
(2,20190205,20190210)
DECLARE @cid INT, @st_date INT, @end_date INT,@B_cid INT,@B_st_date INT
-- Pair each row (A) with the next row (B) in cid, st_date, end_date order
DECLARE vendor_cursor CURSOR FOR
SELECT A.cid, A.st_date, A.end_date, B.cid B_cid, B.st_date B_st_date
FROM
(
    SELECT ROW_NUMBER() OVER (ORDER BY CID, st_date, end_date) RN, *
    FROM
    (
        SELECT CID, CONVERT(varchar, st_date, 23) st_date, CONVERT(varchar, end_date, 23) end_date FROM @tbl
    ) X
) A
LEFT JOIN (
    -- Subtracting 1 shifts the numbering so that B.RN matches the preceding row in A
    SELECT ROW_NUMBER() OVER (ORDER BY CID, st_date, end_date) - 1 RN, *
    FROM
    (
        SELECT CID, CONVERT(varchar, st_date, 23) st_date, CONVERT(varchar, end_date, 23) end_date FROM @tbl
    ) Y
) B
ON A.RN = B.RN
ORDER BY 1, 2
OPEN vendor_cursor
FETCH NEXT FROM vendor_cursor
INTO @cid, @st_date, @end_date, @B_cid, @B_st_date
WHILE @@FETCH_STATUS = 0
BEGIN
    IF (@cid = @B_cid) AND (@end_date = @B_st_date)
    BEGIN
        -- The next row continues this range: keep the same group id
        INSERT INTO @tmp (cid, st_date, end_date, gid)
        VALUES
        (@cid, @st_date, @end_date, @gid)
    END
    ELSE
    BEGIN
        -- The next row starts a new range (or there is no next row): close this group
        INSERT INTO @tmp (cid, st_date, end_date, gid)
        VALUES
        (@cid, @st_date, @end_date, @gid)
        SET @gid = @gid + 1
    END
    FETCH NEXT FROM vendor_cursor
    INTO @cid, @st_date, @end_date, @B_cid, @B_st_date
END
CLOSE vendor_cursor;
DEALLOCATE vendor_cursor
SELECT cid,
MIN(st_date) st_date,
MAX(end_date) end_date
FROM @tmp
GROUP BY cid,gid
The new output is:
cid st_date end_date
1 20190110 20190123
2 20190101 20190117
2 20190119 20190127
2 20190201 20190210
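For comparison, the same grouping rule (a row continues a range only when its st_date equals the previous row's end_date) can be sketched without a cursor. This is only an illustrative alternative, not part of the answer above; it assumes SQL Server 2012 or later for LAG, and the names flagged, grouped, is_new_group and gid are made up for the example.

```sql
-- Sketch: set-based version of the cursor's grouping rule (assumes SQL Server 2012+).
DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT);

INSERT INTO @tbl (cid, st_date, end_date)
VALUES
(1,20190110,20190111),
(1,20190111,20190117),
(1,20190117,20190123),
(2,20190101,20190117),
(2,20190119,20190123),
(2,20190123,20190127),
(2,20190201,20190205),
(2,20190205,20190210);

WITH flagged AS (
    SELECT cid, st_date, end_date,
           -- 0 when this row starts exactly where the previous row of the same cid ended,
           -- 1 otherwise (including the first row of each cid, where LAG returns NULL)
           CASE WHEN LAG(end_date) OVER (PARTITION BY cid ORDER BY st_date, end_date) = st_date
                THEN 0 ELSE 1 END AS is_new_group
    FROM @tbl
),
grouped AS (
    SELECT cid, st_date, end_date,
           -- Running count of range starts gives each contiguous run its own id
           SUM(is_new_group) OVER (PARTITION BY cid ORDER BY st_date, end_date
                                   ROWS UNBOUNDED PRECEDING) AS gid
    FROM flagged
)
SELECT cid, MIN(st_date) AS st_date, MAX(end_date) AS end_date
FROM grouped
GROUP BY cid, gid
ORDER BY cid, MIN(st_date);
```

On the eight sample rows this should return the same four rows as the cursor version. Like the cursor, it compares only against the immediately preceding row, so it is not meant for overlapping or nested intervals.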
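This is a gaps-and-islands problem, but one that has to deal with (potentially) overlapping intervals. For a general solution, I would suggest: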
select cid, min(st_date) as st_date, max(end_date) as end_date
from (select t.*,
sum(case when max_prev_ed >= st_date then 0 else 1 end) over (partition by cid order by st_date) as grp
from (select t.*, max(end_date) over (partition by cid order by st_date rows between unbounded preceding and 1 preceding) as max_prev_ed
from @tbl t
) t
) t
group by cid, grp;
Here is a fiddle.
This is a robust solution for the following cases:
- overlaps of more than one day
- one interval fully contained within another
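To make the two cases concrete, here is a small sketch that runs the same running-maximum logic on made-up data. The cid 3 rows and the CTE names with_prev and with_grp are assumptions for illustration only; the query itself follows the answer's structure.

```sql
-- Sketch: hypothetical rows exercising containment and multi-day overlap.
DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT);

INSERT INTO @tbl (cid, st_date, end_date)
VALUES
(3,20190101,20190131),  -- outer interval
(3,20190105,20190110),  -- fully contained in the outer interval
(3,20190115,20190205),  -- starts after the contained row ends, but overlaps the outer one
(3,20190301,20190310);  -- separate island

WITH with_prev AS (
    SELECT t.*,
           -- Latest end_date seen on any earlier row of the same cid
           MAX(end_date) OVER (PARTITION BY cid ORDER BY st_date
                               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS max_prev_ed
    FROM @tbl t
),
with_grp AS (
    SELECT p.*,
           -- A new island starts only when no earlier interval reaches this st_date
           SUM(CASE WHEN max_prev_ed >= st_date THEN 0 ELSE 1 END)
               OVER (PARTITION BY cid ORDER BY st_date) AS grp
    FROM with_prev p
)
SELECT cid, MIN(st_date) AS st_date, MAX(end_date) AS end_date
FROM with_grp
GROUP BY cid, grp
ORDER BY cid, MIN(st_date);
```

The first three rows should collapse into 20190101 to 20190205 and the last row should stay separate. Comparing the third row only with the row directly before it (end_date 20190110) would wrongly split it off, which is why the running maximum over all earlier rows is used.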