SQL Server: performing a grouping-like operation
I have a table in SQL Server that looks like this:
Date Value
---------------------------------------------------
08-01-2016 1
08-02-2016 1
08-03-2016 1
08-04-2016 1
08-05-2016 1
08-06-2016 2
08-07-2016 2
08-08-2016 2
08-09-2016 2.5
08-10-2016 1
08-11-2016 1
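Because the original table is very large, querying it throws a "System.OutOfMemoryException" even when I use "Results to file". That is why I want to condense the table, but I don't have good logic for doing it. I would like to transform the table into the following form: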
Date_from Date_to Value
-------------------------------------------------
08-01-2016 08-05-2016 1
08-06-2016 08-08-2016 2
08-09-2016 08-09-2016 2.5
08-10-2016 08-11-2016 1
I would appreciate your ideas.

This is commonly known as a gaps-and-islands problem. Here is one trick for solving it:
;WITH data
     AS (SELECT *,
                Lag(Value, 1) OVER (ORDER BY Dates) AS [pVal]
         FROM   (VALUES ('08-01-2016', 1),
                        ('08-02-2016', 1),
                        ('08-03-2016', 1),
                        ('08-04-2016', 1),
                        ('08-05-2016', 1),
                        ('08-06-2016', 2),
                        ('08-07-2016', 2),
                        ('08-08-2016', 2),
                        ('08-09-2016', 2.5),
                        ('08-10-2016', 1),
                        ('08-11-2016', 1)) tc (Dates, Value)),
     intr
     AS (SELECT Dates,
                Value,
                Sum(Iif(pVal = Value, 0, 1)) OVER (ORDER BY Dates) AS [Counter]
         FROM   data)
SELECT Min(Dates) AS Dates_from,
       Max(Dates) AS Dates_to,
       Value
FROM   intr
GROUP  BY [Counter],
          Value
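
To make the trick in the query above easier to follow, here is a minimal sketch that exposes the intermediate columns instead of grouping right away. It assumes SQL Server 2012 or later (for LAG and IIF) and loads the question's sample rows into a hypothetical table variable @t. The column name is_new_island is made up for illustration; Dates, Value, pVal and Counter follow the query above.

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @t TABLE (Dates DATE, Value DECIMAL(9,2));

INSERT INTO @t (Dates, Value)
VALUES ('20160801', 1), ('20160802', 1), ('20160803', 1), ('20160804', 1),
       ('20160805', 1), ('20160806', 2), ('20160807', 2), ('20160808', 2),
       ('20160809', 2.5), ('20160810', 1), ('20160811', 1);

WITH data AS (
    -- pVal is the previous row's Value in date order (NULL on the first row).
    SELECT Dates, Value, LAG(Value) OVER (ORDER BY Dates) AS pVal
    FROM   @t
)
SELECT Dates,
       Value,
       pVal,
       -- 1 whenever the value differs from the previous row (or there is none).
       IIF(pVal = Value, 0, 1) AS is_new_island,
       -- The running total of those flags numbers the islands.
       SUM(IIF(pVal = Value, 0, 1)) OVER (ORDER BY Dates) AS [Counter]
FROM   data
ORDER  BY Dates;
```

On the sample data, Counter is 1 for 08-01 through 08-05, 2 for 08-06 through 08-08, 3 for 08-09, and 4 for 08-10 and 08-11. Grouping by Counter therefore keeps the two separate runs of Value = 1 apart, which a plain GROUP BY Value would merge.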

The cumulative sum / lag approach is one method. In this case, a simpler method is:
select min(date) as date_from, max(date) as date_to, value
from (select t.*,
dateadd(day, - row_number() over (partition by value order by date),date) as grp
from t
) t
group by value, grp;
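
The grouping key in the query above is easier to see when it is selected next to the row number it is derived from. The following is a minimal sketch under stated assumptions: the question's sample rows are loaded into a hypothetical table variable @t, the column seq is named here only for illustration, and the explicit CAST to INT is a defensive choice of this sketch, not part of the query above.

```sql
-- Hypothetical table variable holding the sample rows from the question.
DECLARE @t TABLE ([date] DATE, [value] DECIMAL(9,2));

INSERT INTO @t ([date], [value])
VALUES ('20160801', 1), ('20160802', 1), ('20160803', 1), ('20160804', 1),
       ('20160805', 1), ('20160806', 2), ('20160807', 2), ('20160808', 2),
       ('20160809', 2.5), ('20160810', 1), ('20160811', 1);

SELECT [date],
       [value],
       -- Position of the row among all rows with the same value.
       ROW_NUMBER() OVER (PARTITION BY [value] ORDER BY [date]) AS seq,
       -- Date minus that position: constant within one unbroken run.
       DATEADD(day,
               -CAST(ROW_NUMBER() OVER (PARTITION BY [value] ORDER BY [date]) AS INT),
               [date]) AS grp
FROM   @t
ORDER  BY [date];
```

For the sample data, the five rows from 08-01 to 08-05 all get grp = 2016-07-31, while the later rows for value 1 (08-10 and 08-11, with seq 6 and 7) both get grp = 2016-08-04. The pair (value, grp) therefore identifies each run.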
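This relies on the observation that the dates are consecutive, with no gaps. Subtracting a sequence number from each date therefore yields a constant for as long as the value stays the same.

Here is an example: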
DECLARE @T TABLE (
[Date] DATE,
[Value] DECIMAL(9,2)
)
INSERT @T VALUES
( '08-01-2016', 1 ),
( '08-02-2016', 1 ),
( '08-03-2016', 1 ),
( '08-04-2016', 1 ),
( '08-05-2016', 1 ),
( '08-06-2016', 2 ),
( '08-07-2016', 2 ),
( '08-08-2016', 2 ),
( '08-09-2016', 2.5 ),
( '08-10-2016', 1 ),
( '08-11-2016', 1 )
SELECT * FROM @T
SELECT A.[Date] AS StartDate, B.[Date] AS EndDate, A.[Value]
FROM (
    -- Range starts: rows with no same-value row on the previous day
    SELECT A.*, ROW_NUMBER() OVER (ORDER BY A.[Date], A.[Value]) AS O
    FROM @T A
    LEFT JOIN @T B ON B.[Value] = A.[Value] AND B.[Date] = DATEADD(d, -1, A.[Date])
    WHERE B.[Date] IS NULL
) A
JOIN (
    -- Range ends: rows with no same-value row on the next day
    SELECT A.*, ROW_NUMBER() OVER (ORDER BY A.[Date], A.[Value]) AS O
    FROM @T A
    LEFT JOIN @T B ON B.[Value] = A.[Value] AND B.[Date] = DATEADD(d, 1, A.[Date])
    WHERE B.[Date] IS NULL
) B ON B.O = A.O -- pair the n-th start with the n-th end

Prdp's solution is very good, but for anyone still on SQL Server 2008, where LAG() and the Parallel Data Warehouse (PDW) features are not available, here is an alternative.
Sample data:
IF OBJECT_ID('tempdb..#Temp') IS NOT NULL
DROP TABLE #Temp;
CREATE TABLE #Temp([Dates] DATE
, [Value] FLOAT);
INSERT INTO #Temp([Dates]
, [Value])
VALUES
('08-01-2016'
, 1),
('08-02-2016'
, 1),
('08-03-2016'
, 1),
('08-04-2016'
, 1),
('08-05-2016'
, 1),
('08-06-2016'
, 2),
('08-07-2016'
, 2),
('08-08-2016'
, 2),
('08-09-2016'
, 2.5),
('08-10-2016'
, 1),
('08-11-2016'
, 1);
Query:
;WITH Seq
     AS (SELECT SeqNo = ROW_NUMBER() OVER(ORDER BY [Dates]
                                                 , [Value])
              , t.Dates
              , t.[Value]
         FROM #Temp t)
SELECT StartDate = MIN([Dates])
     , EndDate = MAX([Dates])
     , [Value]
FROM
    (SELECT [Value]
          , [Dates]
          , SeqNo
          , rn = SeqNo - ROW_NUMBER() OVER(PARTITION BY [Value] ORDER BY SeqNo)
     FROM Seq s) a
GROUP BY [Value]
       , rn
ORDER BY StartDate;
Result:
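
Comments:
Can you explain the logic by which you group the data?
Which version of SQL Server are you using?
@Prdp SQL Server 2014 Management Studio.
Are you using this in an application (VB or C)? Handling this in code would work fine and would not throw an out-of-memory exception. Otherwise you will need a stored procedure that loops through all the records to build the result.
@DForck42 Do you mean how I arrived at this kind of result? I couldn't, which is why I am asking you all for ideas. Thanks!
@Gordon Linoff - Sorry, I had declared the Value data type as INT. I have now declared it as decimal and checked again. It works fine.
@Mansoor. To me this looks like the simplest solution.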