SQL: Summarize dates from a list of dates [sql, sql-server, tsql]

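I have a table with an id column and validity start and end date columns. Each id has several validity date ranges.

I want to reduce the number of records as far as possible, ideally producing one row for each run of contiguous dates.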

    declare @tbl table (cid int, st_date int, end_date int )


    insert into @tbl  (cid, st_date,end_date)  
    values (1,20190110,20190111),  
    (1,20190111,20190117), 
    (1,20190117,20190123), 
    (2,20190101,20190117), 
    (2,20190119,20190123),
    (2,20190123,20190127)

Desired output:

    cid    st_date      end_date
      1    20190110     20190123
      2    20190101     20190117
      2    20190119     20190127

For each cid, you would take the earliest start date and the latest end date:

SELECT cid, MIN(st_date) as st_date, MAX(end_date) as end_date
FROM @tbl
GROUP BY cid
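
As a quick check, the aggregate can be run against the six sample rows from the question. This is only a sketch that pairs the query above with that data; the rows noted in the comments follow directly from taking MIN and MAX per cid.

```sql
-- Sample rows copied from the question.
DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT);

INSERT INTO @tbl (cid, st_date, end_date)
VALUES
    (1, 20190110, 20190111),
    (1, 20190111, 20190117),
    (1, 20190117, 20190123),
    (2, 20190101, 20190117),
    (2, 20190119, 20190123),
    (2, 20190123, 20190127);

-- One row per cid: earliest start and latest end, regardless of gaps.
SELECT cid, MIN(st_date) AS st_date, MAX(end_date) AS end_date
FROM @tbl
GROUP BY cid
ORDER BY cid;

-- Returns:
--   1  20190110  20190123
--   2  20190101  20190127
```

For cid 2 this yields a single row, so the gap between 20190117 and 20190119 that the desired output keeps as two separate rows is not preserved.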

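I could not find any option other than using a cursor. You can now check the new output listed below, produced with the newly provided data. The output looks correct to me.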

DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT )
DECLARE @tmp TABLE (cid INT, st_date INT, end_date INT, gid INT )

DECLARE @gid INT
SET @gid = 1

INSERT INTO @tbl (cid, st_date,end_date) 
VALUES 
(1,20190110,20190111), 
(1,20190111,20190117), 
(1,20190117,20190123), 
(2,20190101,20190117), 
(2,20190119,20190123), 
(2,20190123,20190127), 
(2,20190201,20190205), 
(2,20190205,20190210)

DECLARE @cid INT, @st_date INT, @end_date INT,@B_cid INT,@B_st_date INT

DECLARE vendor_cursor CURSOR FOR   
SELECT A.cid,A.st_date,A.end_date,B.cid B_cid,B.st_date B_st_date
FROM
(
    SELECT ROW_NUMBER() OVER (ORDER BY CID,st_date,end_date) RN,* 
    FROM 
    (
        SELECT CID,CONVERT(varchar, st_date, 23) st_date,CONVERT(varchar, end_date, 23) end_date FROM @tbl
    )X
)A
LEFT JOIN (
    SELECT ROW_NUMBER() OVER (ORDER BY CID,st_date,end_date)-1 RN,* 
    FROM 
    (
        SELECT CID,CONVERT(varchar, st_date, 23) st_date,CONVERT(varchar, end_date, 23) end_date FROM @tbl
    )Y
)B
ON A.RN = B.RN
ORDER BY 1,2

OPEN vendor_cursor  
FETCH NEXT FROM vendor_cursor   
INTO @cid, @st_date,@end_date,@B_cid,@B_st_date  

WHILE @@FETCH_STATUS = 0  
BEGIN

    -- Every row is stored under the current group id.
    INSERT INTO @tmp (cid, st_date, end_date, gid)
    VALUES (@cid, @st_date, @end_date, @gid)

    -- B holds the next row in (cid, st_date) order. Start a new group
    -- unless that row has the same cid and begins exactly where this one ends.
    IF @B_cid IS NULL OR @cid <> @B_cid OR @end_date <> @B_st_date
    BEGIN
        SET @gid = @gid + 1
    END

    FETCH NEXT FROM vendor_cursor   
    INTO @cid, @st_date,@end_date,@B_cid,@B_st_date   
END   
CLOSE vendor_cursor;  
DEALLOCATE vendor_cursor

SELECT cid,
MIN(st_date) st_date,
MAX(end_date) end_date
FROM @tmp
GROUP BY cid,gid

The new output is:

cid st_date     end_date
1   20190110    20190123
2   20190101    20190117
2   20190119    20190127
2   20190201    20190210

This is a gaps-and-islands problem, but one that deals with (potentially) overlapping intervals. For a general solution, I suggest:

select cid, min(st_date) as st_date, max(end_date) as end_date
from (select t.*,
             sum(case when max_prev_ed >= st_date then 0 else 1 end) over (partition by cid order by st_date) as grp
      from (select t.*, max(end_date) over (partition by cid order by st_date rows between unbounded preceding and 1 preceding) as max_prev_ed
            from @tbl t
           ) t
     ) t
group by cid, grp;

Here is a fiddle.

This is a robust solution for the following cases:

  • Overlaps of more than one day
  • One interval entirely contained within another
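
To illustrate those two cases, here is a small sketch that runs the same query against made-up data. The four rows below are hypothetical (they are not from the question) and were chosen so that one range is nested inside another and one overlaps by several days.

```sql
-- Hypothetical sample data: nested and overlapping ranges for a single cid.
DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT);

INSERT INTO @tbl (cid, st_date, end_date)
VALUES
    (1, 20190101, 20190120),  -- long range
    (1, 20190105, 20190110),  -- fully contained in the first range
    (1, 20190118, 20190125),  -- overlaps the first range by several days
    (1, 20190201, 20190205);  -- separate island after a gap

SELECT cid, MIN(st_date) AS st_date, MAX(end_date) AS end_date
FROM (
    SELECT t.*,
           -- Running count of island starts: each island gets its own number.
           SUM(CASE WHEN max_prev_ed >= st_date THEN 0 ELSE 1 END)
               OVER (PARTITION BY cid ORDER BY st_date) AS grp
    FROM (
        SELECT t.*,
               -- Latest end date among all earlier rows of the same cid
               -- (NULL for the first row, which therefore starts an island).
               MAX(end_date) OVER (PARTITION BY cid ORDER BY st_date
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS max_prev_ed
        FROM @tbl t
    ) t
) t
GROUP BY cid, grp
ORDER BY cid, st_date;

-- Returns:
--   1  20190101  20190125
--   1  20190201  20190205
```

Comparing each start date with the running maximum of earlier end dates, rather than with the immediately preceding row's end date, is what lets the nested range (20190105 to 20190110) stay inside the first island instead of splitting it.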

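No, that only works when the dates are properly contiguous, as they are for cid 1; for cid 2 there is a one-day gap in validity. I have edited the example, please take a look.

I will keep working on this a bit, but you basically have a "gaps and islands" problem. There is no simple solution, but I can suggest some reading to help you understand what is going on and how to fix it. For starters, I strongly recommend storing dates in a date data type rather than as integers. Then I suggest looking at this article, which explains how to group islands over contiguous data like this.

Please use proper data types. Using int or varchar as the type of a date column will give you nightmares.

Please try it with these sample values instead: insert into @tbl (cid, st_date, end_date) values (1,20190110,20190111), (1,20190111,20190117), (1,20190117,20190123), (2,20190101,20190117), (2,20190119,20190123), (2,20190123,20190127), (2,20190201,20190205), (2,20190205,20190210)

I have added the new output to the answer. Please check.

Some issues were detected. Let me see whether I can handle them.

Changed the script to use a cursor.

This may produce the desired result, but a cursor is a very inefficient approach here. Performance will degrade quickly. Cursors should be avoided when a set-based option is available.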
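
The two recommendations in the comments above (use a real date type, prefer a set-based query over a cursor) can be combined in a sketch like the following. It is not taken from any of the answers: the CTE names `typed`, `flagged`, and `grouped` are made up for illustration, it assumes SQL Server 2012 or later for `LAG`, and it assumes ranges only touch or follow one another (for nested ranges, the running-maximum approach in the answer above is the safer choice).

```sql
-- Sample rows from the question, including the two extra rows from the comments.
DECLARE @tbl TABLE (cid INT, st_date INT, end_date INT);

INSERT INTO @tbl (cid, st_date, end_date)
VALUES
    (1, 20190110, 20190111),
    (1, 20190111, 20190117),
    (1, 20190117, 20190123),
    (2, 20190101, 20190117),
    (2, 20190119, 20190123),
    (2, 20190123, 20190127),
    (2, 20190201, 20190205),
    (2, 20190205, 20190210);

WITH typed AS (
    -- Convert the yyyymmdd integers to real dates (style 112 = yyyymmdd).
    SELECT cid,
           CONVERT(date, CONVERT(char(8), st_date), 112)  AS st_date,
           CONVERT(date, CONVERT(char(8), end_date), 112) AS end_date
    FROM @tbl
),
flagged AS (
    -- A row starts a new island unless the previous row's end date reaches it.
    SELECT cid, st_date, end_date,
           CASE WHEN LAG(end_date) OVER (PARTITION BY cid ORDER BY st_date) >= st_date
                THEN 0 ELSE 1 END AS is_start
    FROM typed
),
grouped AS (
    -- Running total of island starts gives each island a group number.
    SELECT cid, st_date, end_date,
           SUM(is_start) OVER (PARTITION BY cid ORDER BY st_date
                               ROWS UNBOUNDED PRECEDING) AS grp
    FROM flagged
)
SELECT cid, MIN(st_date) AS st_date, MAX(end_date) AS end_date
FROM grouped
GROUP BY cid, grp
ORDER BY cid, st_date;

-- Returns:
--   1  2019-01-10  2019-01-23
--   2  2019-01-01  2019-01-17
--   2  2019-01-19  2019-01-27
--   2  2019-02-01  2019-02-10
```

The result matches the cursor answer's output for the same data, but it is computed in a single pass of window functions rather than row by row.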