SQL Server: remove consecutive repeated values from a date-based timeline
Tags: sql-server, tsql, sql-server-2012
I have a table that holds date-based user actions. The table serves as a timeline of events. The sample below shows how two people changed job roles over time:
DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'),
(2, 200, 'Promoted', '2008-01-01');
SELECT * FROM @tbl ORDER BY UserID, ActionDate DESC;
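This lists each user's events with the most recent first.
I need to display the table in reverse date order, but remove any event that directly follows another event with the same [UserID/ActionID]. For example, if a person is promoted and then immediately promoted again, the second promotion is left out of the result because it is treated as a repeat of the previous action.
The desired output therefore keeps, for each user, only the earliest event in each run of consecutive identical actions.
After some research, I tried ROW_NUMBER() to identify the repeats: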
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY UserID, ActionID ORDER BY ActionDate ASC) AS RowNum
FROM
@tbl
ORDER BY
UserID, ActionDate DESC;
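...but it doesn't quite work, because the numbering does not reset after each different action. I may be overthinking this, but I'm struggling for inspiration: searching only turns up countless questions where people simply remove duplicates from a list.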
DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'), --<<--- here ActionID is 300
(2, 200, 'Promoted', '2008-01-01');
select UserID, ActionID, ActionDesc, min(ActionDate) as dt
from (
select t.*
, row_number() over(partition by UserID, ActionID order by ActionDate)
- row_number() over(partition by UserID order by ActionDate) as grp_id
from @tbl t
) v
group by grp_id, UserID, ActionID, ActionDesc
order by UserID, min(ActionDate) desc;
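The first (inner) query assigns each row a row number, ordered by UserID and ActionDate (call it A). I then calculate a second row number, the same but also partitioned by the action (call it B). Taking the difference of the two gives a number that can only apply to one run of rows for a given UserID and action. By generating another row number, partitioned by UserID and ActionID within each such run, I can then pick row 1, the earliest date, and use that to eliminate the unwanted rows.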
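One way to read the steps described above is sketched below. It is an illustration under assumptions, not the answerer's exact code: the aliases rn_user, rn_action, grp_id and rn_in_run and the CTE names are made up for this example, and the third ROW_NUMBER() is partitioned by the row-number difference as well as by UserID and ActionID so that each run gets its own numbering.

```sql
DECLARE @tbl TABLE (
    UserID int,
    ActionID int,
    ActionDesc nvarchar(50),
    ActionDate datetime
);

INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 200, 'Promoted', '2002-02-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (1, 200, 'Promoted', '2005-03-01'),
    (1, 200, 'Promoted', '2006-03-01'),
    (2, 200, 'Promoted', '2006-01-01'),
    (2, 300, 'Moved',    '2007-01-01'),
    (2, 200, 'Promoted', '2008-01-01');

WITH numbered AS (
    SELECT t.*,
           -- A: position of the row within the user's whole timeline
           ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY ActionDate) AS rn_user,
           -- B: position of the row among the user's rows with the same action
           ROW_NUMBER() OVER (PARTITION BY UserID, ActionID ORDER BY ActionDate) AS rn_action
    FROM @tbl AS t
),
ranked AS (
    SELECT n.*,
           -- A - B stays constant while the same action repeats back to back.
           -- For user 1, in date order, it is 0, 0, 0, 3, 1, 1.
           n.rn_user - n.rn_action AS grp_id,
           -- Number the rows inside each run; row 1 is the earliest date of the run
           ROW_NUMBER() OVER (PARTITION BY UserID, ActionID, n.rn_user - n.rn_action
                              ORDER BY ActionDate) AS rn_in_run
    FROM numbered AS n
)
SELECT UserID, ActionID, ActionDesc, ActionDate, grp_id
FROM ranked
WHERE rn_in_run = 1          -- keep only the first event of each run
ORDER BY UserID, ActionDate DESC;
```

Selecting from ranked without the WHERE clause shows every row with its grp_id, which makes it easier to see why the difference identifies each run.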
USE tempdb;
DECLARE @tbl TABLE (
UserID int,
ActionID int,
ActionDesc nvarchar(50),
ActionDate datetime
);
INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
-- First person
(1, 200, 'Promoted', '2000-01-01'),
(1, 200, 'Promoted', '2001-01-01'),
(1, 200, 'Promoted', '2002-02-01'),
(1, 300, 'Moved', '2004-03-01'),
(1, 200, 'Promoted', '2005-03-01'),
(1, 200, 'Promoted', '2006-03-01'),
-- Second person
(2, 200, 'Promoted', '2006-01-01'),
(2, 300, 'Moved', '2007-01-01'),
(2, 200, 'Promoted', '2008-01-01');
;WITH src AS
(
SELECT *
, l = LEAD(t.ActionID) OVER (PARTITION BY t.UserID ORDER BY t.ActionDate DESC)
FROM @tbl t
)
SELECT src.UserID
, src.ActionID
, src.ActionDesc
, src.ActionDate
FROM src
WHERE src.l <> src.ActionID
OR src.l IS NULL
ORDER BY src.UserID, src.ActionDate DESC;
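For tables with a large number of rows, you want to keep the number of aggregates used in the query as low as possible; LEAD needs only a single aggregate to do this. Execution plan for my version: [image not included]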
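Comments:
Argh. That was a typo in my actual test code. Yes, it should be 300. I've updated the OP.
Can you explain what you are doing at each step of the code? It looks like you are subtracting one set of row numbers from another??
Brilliant! Can you explain what is happening in the code? I would never have thought of that.
If you peel back the outer layers, you can see how I built up toward the final goal. It would probably be easier to follow written with common table expressions instead of subqueries.
There it is, that made my day! So simple, yet so powerful. Every once in a while (like now) I learn something new about T-SQL that I can use in plenty of other places to make life easier, but I wonder how I would ever have learned it without posting on SO...!! Thank you very much.
My pleasure. And yes, Stack Overflow has changed my life too!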
Result of the LEAD query:
╔════════╦══════════╦════════════╦═════════════════════════╗
║ UserID ║ ActionID ║ ActionDesc ║ ActionDate ║
╠════════╬══════════╬════════════╬═════════════════════════╣
║ 1 ║ 200 ║ Promoted ║ 2005-03-01 00:00:00.000 ║
║ 1 ║ 300 ║ Moved ║ 2004-03-01 00:00:00.000 ║
║ 1 ║ 200 ║ Promoted ║ 2000-01-01 00:00:00.000 ║
║ 2 ║ 200 ║ Promoted ║ 2008-01-01 00:00:00.000 ║
║ 2 ║ 300 ║ Moved ║ 2007-01-01 00:00:00.000 ║
║ 2 ║ 200 ║ Promoted ║ 2006-01-01 00:00:00.000 ║
╚════════╩══════════╩════════════╩═════════════════════════╝
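Because the LEAD query orders each user's rows by ActionDate DESC, the "next" row it looks at is the chronologically previous event. The same comparison could presumably be written with LAG over ascending dates, which some readers find easier to follow. This is a sketch, not part of the original answers; prev_action is a made-up alias. LAG, like LEAD, is available from SQL Server 2012 onward.

```sql
DECLARE @tbl TABLE (
    UserID int,
    ActionID int,
    ActionDesc nvarchar(50),
    ActionDate datetime
);

INSERT INTO @tbl (UserID, ActionID, ActionDesc, ActionDate)
VALUES
    (1, 200, 'Promoted', '2000-01-01'),
    (1, 200, 'Promoted', '2001-01-01'),
    (1, 200, 'Promoted', '2002-02-01'),
    (1, 300, 'Moved',    '2004-03-01'),
    (1, 200, 'Promoted', '2005-03-01'),
    (1, 200, 'Promoted', '2006-03-01'),
    (2, 200, 'Promoted', '2006-01-01'),
    (2, 300, 'Moved',    '2007-01-01'),
    (2, 200, 'Promoted', '2008-01-01');

WITH src AS (
    SELECT t.*,
           -- ActionID of the same user's previous event in time (NULL for the first event)
           LAG(t.ActionID) OVER (PARTITION BY t.UserID ORDER BY t.ActionDate) AS prev_action
    FROM @tbl AS t
)
SELECT UserID, ActionID, ActionDesc, ActionDate
FROM src
WHERE prev_action IS NULL        -- the user's first event is always kept
   OR prev_action <> ActionID    -- keep an event only when the action changed
ORDER BY UserID, ActionDate DESC;
```

On the sample data this should keep the same six rows as the result table above, since both versions compare each event with the one immediately before it for the same user.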