SQL: how to number rows using ordering, partitioning, and grouping
Tags: sql, sql-server, sql-server-2008-r2
I need to number rows using ordering, partitioning, and grouping: order by IdDocument, DateChange; partition by IdDocument; group by IdRole. The grouping is the hard part. As the NumberingExpected column in the example shows, DENSE_RANK looks like the best fit for this, but it repeats a number only when the values used for ordering are equal. In my case the ordering values (IdDocument, DateChange) are always different, and the repetition has to be driven by IdRole.
Of course this is easy to solve with a cursor. But is there a way to do it with the numbering/ranking functions?
Test data:
declare @LogTest as table (
Id INT
,IdRole INT
,DateChange DATETIME
,IdDocument INT
,NumberingExpected INT
)
insert into @LogTest
select 1 as Id, 7 as IdRole, GETDATE() as DateChange, 13 as IdDocument, 1 as NumberingExpected
union
select 2, 3, DATEADD(HH, 1, GETDATE()), 13, 2
union
select 3, 3, DATEADD(HH, 2, GETDATE()), 13, 2
union
select 4, 3, DATEADD(HH, 3, GETDATE()), 13, 2
union
select 5, 5, DATEADD(HH, 4, GETDATE()), 13, 3
union
select 7, 3, DATEADD(HH, 6, GETDATE()), 13, 4
union
select 6, 3, DATEADD(HH, 5, GETDATE()), 27, 1
union
select 8, 3, DATEADD(HH, 7, GETDATE()), 27, 1
union
select 9, 5, DATEADD(HH, 8, GETDATE()), 27, 2
union
select 10, 3, DATEADD(HH, 9, GETDATE()), 27, 3
select * from @LogTest order by IdDocument, DateChange;
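Explanation as step-by-step pseudocode:
1. Order the data by IdDocument, DateChange.
2. Set the first row's number to i = 1 and go to the next row.
3. If the document changed { i = 1; } else { if IdRole changed { i++; } }
4. Set the row number to i.
5. Go to the next row; if EOF { exit; } else { go to step 3; }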
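The steps above can be written almost literally as a cursor loop, which is the approach the question hopes to avoid but which is handy as a baseline for checking a set-based answer. This is a minimal sketch, not the asker's code: the table variable @LogDemo, its Numbering column, the cursor name, and the fixed timestamps (used in place of GETDATE() so the run is repeatable) are all assumptions made for illustration.

```sql
-- Hypothetical copy of the test data with fixed timestamps; Numbering is filled in by the loop.
DECLARE @LogDemo TABLE (Id INT, IdRole INT, DateChange DATETIME, IdDocument INT, Numbering INT NULL);

INSERT INTO @LogDemo (Id, IdRole, DateChange, IdDocument)
VALUES
     (1,  7, '2015-01-26T20:00:00', 13)
    ,(2,  3, '2015-01-26T21:00:00', 13)
    ,(3,  3, '2015-01-26T22:00:00', 13)
    ,(4,  3, '2015-01-26T23:00:00', 13)
    ,(5,  5, '2015-01-27T00:00:00', 13)
    ,(7,  3, '2015-01-27T02:00:00', 13)
    ,(6,  3, '2015-01-27T01:00:00', 27)
    ,(8,  3, '2015-01-27T03:00:00', 27)
    ,(9,  5, '2015-01-27T04:00:00', 27)
    ,(10, 3, '2015-01-27T05:00:00', 27);

DECLARE @Id INT, @IdRole INT, @IdDocument INT;
DECLARE @PrevRole INT, @PrevDocument INT, @i INT;

-- Step 1: walk the rows ordered by IdDocument, DateChange.
-- STATIC takes a snapshot, so the UPDATE inside the loop does not disturb the cursor.
DECLARE LogCursor CURSOR LOCAL STATIC FOR
    SELECT Id, IdRole, IdDocument
    FROM @LogDemo
    ORDER BY IdDocument, DateChange;

OPEN LogCursor;
FETCH NEXT FROM LogCursor INTO @Id, @IdRole, @IdDocument;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- Steps 2 and 3: restart at 1 on the first row or on a new document,
    -- otherwise increment only when IdRole differs from the previous row.
    IF @PrevDocument IS NULL OR @IdDocument <> @PrevDocument
        SET @i = 1;
    ELSE IF @IdRole <> @PrevRole
        SET @i = @i + 1;

    -- Step 4: store the number on the current row.
    UPDATE @LogDemo SET Numbering = @i WHERE Id = @Id;

    -- Step 5: remember the current row and move on.
    SET @PrevDocument = @IdDocument;
    SET @PrevRole = @IdRole;
    FETCH NEXT FROM LogCursor INTO @Id, @IdRole, @IdDocument;
END;

CLOSE LogCursor;
DEALLOCATE LogCursor;

SELECT Id, IdRole, DateChange, IdDocument, Numbering
FROM @LogDemo
ORDER BY IdDocument, DateChange;
```

With this data the Numbering column should reproduce NumberingExpected from the question: 1, 2, 2, 2, 3, 4 for document 13 and 1, 1, 2, 3 for document 27.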
This may not be pretty, but it does produce the desired output:
; with cte as (
select l.Id,l.IdRole,l.IdDocument,l.NumberingExpected,l.DateChange,
(select min(x.DateChange) from @LogTest x where x.IdDocument = l.IdDocument and x.IdRole = l.IdRole and x.id<=l.id and
x.id > (select max(y.id) from @LogTest y where y.IdDocument = l.IdDocument and y.IdRole <> l.IdRole and y.id <=l.Id)) as DateChange2
from @LogTest l
)
select c.Id,c.IdRole,c.DateChange,c.IdDocument,c.NumberingExpected,dense_rank() over (partition by c.IdDocument order by c.DateChange2) as rn
from cte c order by c.IdDocument, c.DateChange;
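If I had more time, I think the x.id predicates in the CTE could be improved.
Since SQL Server 2012 you can use LAG/LEAD, but they are not available in 2008, so we will emulate them. Performance may be poor, so you should check it against your real data.
Here is the final query: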
WITH
CTE_rn
AS
(
SELECT
Main.IdRole
,Main.IdDocument
,Main.DateChange
,ROW_NUMBER() OVER(PARTITION BY Main.IdDocument ORDER BY Main.DateChange) AS rn
FROM
@LogTest AS Main
OUTER APPLY
(
SELECT TOP (1) T.IdRole
FROM @LogTest AS T
WHERE
T.IdDocument = Main.IdDocument
AND T.DateChange < Main.DateChange
ORDER BY T.DateChange DESC
) AS Prev
WHERE Main.IdRole <> Prev.IdRole OR Prev.IdRole IS NULL
)
SELECT *
FROM
@LogTest AS LT
CROSS APPLY
(
SELECT TOP(1) CTE_rn.rn
FROM CTE_rn
WHERE
CTE_rn.IdDocument = LT.IdDocument
AND CTE_rn.IdRole = LT.IdRole
AND CTE_rn.DateChange <= LT.DateChange
ORDER BY CTE_rn.DateChange DESC
) CA_rn
ORDER BY IdDocument, DateChange;
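How it works
1. With the table ordered by IdDocument and DateChange, we need the IdRole value of the previous row. To get it we use OUTER APPLY, because LAG is not available: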
SELECT *
FROM
@LogTest AS Main
OUTER APPLY
(
SELECT TOP (1) T.IdRole
FROM @LogTest AS T
WHERE
T.IdDocument = Main.IdDocument
AND T.DateChange < Main.DateChange
ORDER BY T.DateChange DESC
) AS Prev
ORDER BY Main.IdDocument, Main.DateChange;
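2. We want to drop the rows whose IdRole repeats the previous row's IdRole, so we add a WHERE clause and number the remaining rows. You can see that the row numbers match the expected result: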
SELECT
Main.IdRole
,Main.IdDocument
,Main.DateChange
,ROW_NUMBER() OVER(PARTITION BY Main.IdDocument ORDER BY Main.DateChange) AS rn
FROM
@LogTest AS Main
OUTER APPLY
(
SELECT TOP (1) T.IdRole
FROM @LogTest AS T
WHERE
T.IdDocument = Main.IdDocument
AND T.DateChange < Main.DateChange
ORDER BY T.DateChange DESC
) AS Prev
WHERE Main.IdRole <> Prev.IdRole OR Prev.IdRole IS NULL
;
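3. Finally, we need to get the correct row number from the CTE for each row of the original table. I use CROSS APPLY to pick one row from the CTE for each row of the original table.
Comments:
What is your expected output?
@Arion The expected output is the test data itself: the input is the projection of the output onto every column except NumberingExpected.
@Arion, if I understand your requirement correctly, I don't think you can achieve this with the built-in ranking functions alone. In effect you want to partition by contiguous runs of rows in DateChange order, not by discrete DateChange values.
I still think that the expected output for the sample data you provided would be the easiest way for all of us to understand exactly what you are trying to get, and it would most likely help you get a correct answer.
LAG is not available in SQL Server 2008 R2.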
Result of the step 1 query (the last IdRole column is Prev.IdRole, the role of the previous row in the same document):
Id  IdRole  DateChange               IdDocument  NumberingExpected  IdRole
1   7       2015-01-26 20:50:32.560  13          1                  NULL
2   3       2015-01-26 21:50:32.560  13          2                  7
3   3       2015-01-26 22:50:32.560  13          2                  3
4   3       2015-01-26 23:50:32.560  13          2                  3
5   5       2015-01-27 00:50:32.560  13          3                  3
7   3       2015-01-27 02:50:32.560  13          4                  5
6   3       2015-01-27 01:50:32.560  27          1                  NULL
8   3       2015-01-27 03:50:32.560  27          1                  3
9   5       2015-01-27 04:50:32.560  27          2                  3
10  3       2015-01-27 05:50:32.560  27          3                  5
Result of the step 2 query:
IdRole  IdDocument  DateChange               rn
7       13          2015-01-26 20:13:26.247  1
3       13          2015-01-26 21:13:26.247  2
5       13          2015-01-27 00:13:26.247  3
3       13          2015-01-27 02:13:26.247  4
3       27          2015-01-27 01:13:26.247  1
5       27          2015-01-27 04:13:26.247  2
3       27          2015-01-27 05:13:26.247  3
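For comparison, the answer notes that LAG is available from SQL Server 2012 onward. On those versions the same numbering can be sketched without the OUTER APPLY emulation: flag each row whose IdRole differs from the previous row in its document, then take a running sum of the flags. This is only a sketch under assumptions and will not run on 2008 R2; @LogDemo, the CTE name Flagged, and the columns IsNewGroup and Numbering are hypothetical names, and the timestamps are fixed stand-ins for the GETDATE() values in the question.

```sql
-- Requires SQL Server 2012 or later (LAG and a framed SUM() OVER).
-- Hypothetical copy of the test data with fixed timestamps.
DECLARE @LogDemo TABLE (Id INT, IdRole INT, DateChange DATETIME, IdDocument INT);

INSERT INTO @LogDemo (Id, IdRole, DateChange, IdDocument)
VALUES
     (1,  7, '2015-01-26T20:00:00', 13)
    ,(2,  3, '2015-01-26T21:00:00', 13)
    ,(3,  3, '2015-01-26T22:00:00', 13)
    ,(4,  3, '2015-01-26T23:00:00', 13)
    ,(5,  5, '2015-01-27T00:00:00', 13)
    ,(7,  3, '2015-01-27T02:00:00', 13)
    ,(6,  3, '2015-01-27T01:00:00', 27)
    ,(8,  3, '2015-01-27T03:00:00', 27)
    ,(9,  5, '2015-01-27T04:00:00', 27)
    ,(10, 3, '2015-01-27T05:00:00', 27);

WITH Flagged AS
(
    SELECT
        L.Id
        ,L.IdRole
        ,L.DateChange
        ,L.IdDocument
        -- 1 when the row starts a new run of IdRole within its document, else 0.
        -- LAG returns NULL on the first row of a partition, so that row falls into ELSE 1.
        ,CASE
            WHEN LAG(L.IdRole) OVER (PARTITION BY L.IdDocument ORDER BY L.DateChange) = L.IdRole
            THEN 0
            ELSE 1
         END AS IsNewGroup
    FROM @LogDemo AS L
)
SELECT
    F.Id
    ,F.IdRole
    ,F.DateChange
    ,F.IdDocument
    -- Running count of run starts = the number of the run that the row belongs to.
    ,SUM(F.IsNewGroup) OVER (PARTITION BY F.IdDocument
                             ORDER BY F.DateChange
                             ROWS UNBOUNDED PRECEDING) AS Numbering
FROM Flagged AS F
ORDER BY F.IdDocument, F.DateChange;
```

With this data Numbering should come out as 1, 2, 2, 2, 3, 4 for document 13 and 1, 1, 2, 3 for document 27, matching NumberingExpected. The sketch relies on DateChange being unique within a document, as the question states; if ties were possible, a tiebreaker such as Id would need to be added to both ORDER BY clauses.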