SQL: pairing transactions into date-range rows

sql sql-server tsql sql-server-2014

I have a table structured as follows. It shows when an employee was added to (Operation = I) or removed from (Operation = D) an account in a particular role.

Account | Employee | Role | Operation | OperationTimestamp
ABC     | 1        | Rep  | I         | 1/1/2018
DEF     | 1        | Mgr  | I         | 1/1/2018
ABC     | 1        | Rep  | D         | 3/31/2018
ABC     | 1        | Rep  | I         | 7/1/2018
ABC     | 1        | Rep  | D         | 12/31/2018
ABC     | 2        | Mgr  | I         | 1/1/2018
DEF     | 2        | Exc  | I         | 1/1/2018
ABC     | 2        | Mgr  | D         | 3/31/2018
ABC     | 2        | Mgr  | I         | 6/1/2018
ABC     | 2        | Mgr  | D         | 10/31/2018

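(I = insert, D = delete)

I need to develop a query that returns the account, employee, role, and the date range during which the employee was on that account, like this: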

Account | Employee | Role | StartingDate | EndingDate
ABC     | 1        | Rep  | 1/1/2018     | 3/31/2018
DEF     | 1        | Mgr  | 1/1/2018     | NULL
ABC     | 1        | Rep  | 7/1/2018     | 12/31/2018
ABC     | 2        | Mgr  | 1/1/2018     | 3/31/2018
DEF     | 2        | Exc  | 1/1/2018     | NULL
ABC     | 2        | Mgr  | 6/1/2018     | 10/31/2018

As the result set shows, if an employee has been added to an account but not yet removed, EndingDate should be NULL.

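My biggest concern is that the same employee can be added to and removed from an account multiple times and/or in multiple roles. My gut tells me I need to sort the transactions by account > employee > role > date and somehow group every two rows together (since it should always be an I operation followed by a D operation), but I'm not sure how to handle the "missing" delete transactions for employees who are still on an account.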

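Assumption: for the same combination (account, employee, role), an I operation is never followed by another I; if there is a next row (there may not be one for that combination), it is always a D.

If the above is true, I would use the following query: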

with
x as (
  -- for every row, look ahead to the next row of the same (account, employee, role)
  select
    account, employee, role, operationtimestamp, operation,
    lead(operation)
      over(partition by account, employee, role
           order by operationtimestamp)
      as next_op, -- not used below; handy for checking the assumption
    lead(operationtimestamp)
      over(partition by account, employee, role
           order by operationtimestamp)
      as next_ts
  from my_table
),
y as (
  -- each I row opens a range; the next row's timestamp (or NULL) closes it
  select
    account, employee, role,
    operationtimestamp as startingdate,
    next_ts as endingdate
  from x
  where operation = 'I'
)
select *
from y
order by employee, startingdate

Result:

account  employee  role  startingdate           endingdate           
-------  --------  ----  ---------------------  ---------------------
ABC      1         Rep   2018-01-01 00:00:00.0  2018-03-31 00:00:00.0
DEF      1         Mgr   2018-01-01 00:00:00.0  <null>               
ABC      1         Rep   2018-07-01 00:00:00.0  2018-12-31 00:00:00.0
ABC      2         Mgr   2018-01-01 00:00:00.0  2018-03-31 00:00:00.0
DEF      2         Exc   2018-01-01 00:00:00.0  <null>               
ABC      2         Mgr   2018-06-01 00:00:00.0  2018-10-31 00:00:00.0
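
Since the query relies on the I/D alternation assumption, it may be worth checking the data first. The following is only a sketch, assuming the same hypothetical table name my_table and column names used in the query above. It lists every row that breaks the pattern: an I directly after another I, or a D that is not directly preceded by an I.

```sql
-- Sketch: find rows that violate the assumed I -> D alternation.
-- my_table and its column names are assumptions carried over from the query above.
with x as (
  select
    account, employee, role, operation, operationtimestamp,
    lag(operation)
      over(partition by account, employee, role
           order by operationtimestamp) as prev_op
  from my_table
)
select account, employee, role, operation, operationtimestamp, prev_op
from x
where (operation = 'I' and prev_op = 'I')                       -- I directly after another I
   or (operation = 'D' and (prev_op is null or prev_op = 'D'))  -- D with no I directly before it
order by account, employee, role, operationtimestamp;
```

If this returns no rows, the assumption holds for the current data and the simple LEAD() pairing is sufficient.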

This is pretty straightforward with row_number and a self join:
declare @t table(Account varchar(3), Employee int, EmpRole varchar(3), Operation varchar(1), OperationTimestamp datetime);
insert into @t values
 ('ABC',1,'Rep','I','20180101')
,('DEF',1,'Mgr','I','20180101')
,('ABC',1,'Rep','D','20180331')
,('ABC',1,'Rep','I','20180701')
,('ABC',1,'Rep','D','20181231')
,('ABC',2,'Mgr','I','20180101')
,('DEF',2,'Exc','I','20180101')
,('ABC',2,'Mgr','D','20180331')
,('ABC',2,'Mgr','I','20180601')
,('ABC',2,'Mgr','D','20181031');

with d as
(
    select Account
            ,Employee
            ,EmpRole
            ,Operation
            ,OperationTimestamp
            ,row_number() over (partition by Account, Employee, EmpRole order by OperationTimestamp) as ord
    from @t
)
select s.Account
    ,s.Employee
    ,s.EmpRole
    ,s.OperationTimestamp as OperationTimestampStart
    ,e.OperationTimestamp as OperationTimestampEnd
from d as s
    left join d as e
        on s.Account = e.Account
            and s.Employee = e.Employee
            and s.EmpRole = e.EmpRole
            and s.ord = e.ord-1
where s.Operation = 'I';

I think you just need lead() or a cumulative min(). This is what I mean:
select account, employee, role, OperationTimestamp, EndingDate
from (select t.*,
             min(case when operation = 'D' then OperationTimestamp end) over
                 (partition by account, employee, role
                  order by OperationTimestamp desc
                 ) as EndingDate
      from t
     ) t
where operation = 'I';

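For a given role, can there be multiple consecutive I rows, or is an I always immediately followed by a D?

Gordon's now-deleted answer was correct; this is indeed a gaps-and-islands problem. You can all use this to test your queries.

@TimBiegeleisen, unless they want something more than what is stated in the question, I think this is a lot simpler than a gaps-and-islands problem. You just need to find the single D record that immediately follows an I record for each Account | Employee | Role combination.

Yes, but how do you do that?

Hmm... I was going to say that it is safe to assume a D immediately follows an I for the same account/emp/role, but I have now found some oddities where there is a D with no original I. I believe this happened during a database migration (there was no I when the employee originally joined the account, because they were attached to the account as part of the migration), but a D transaction shows up when they leave for the first time. So I may have a more complex scenario to deal with. Not thrilled...

You don't need your y cte there. Just put the order by at the end of the second statement.

I don't have a good test environment, so do you know how LEAD() performs compared with the left join in @iamdave's answer?

I tend to think LEAD() is executed with a "sort" operation, while the left join is executed with a nested loops join (or a hash join, or a merge join). I think the sort operation should be faster. To really find out, you would need to retrieve the execution plans for both queries and compare them. SQL optimizers sometimes get it right, and other times not so much. Either way, compare the cost of the two plans.

Apparently, even though we are on SQL 2014, the compatibility level has been set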
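
The comments mention rows where a D appears with no original I (employees attached to an account during a migration). One possible way to surface those is sketched below; it is not taken from any of the answers. It assumes the hypothetical table name my_table used in the LEAD() answer, and it assumes an orphaned D should be reported as a range with an unknown (NULL) StartingDate, which may or may not match the real requirement.

```sql
-- Sketch: pair each I with the following row, and additionally report
-- D rows that have no I directly before them as ranges with an unknown start.
-- my_table and the NULL-start convention are assumptions for illustration.
with x as (
  select
    account, employee, role, operation, operationtimestamp,
    lag(operation)
      over(partition by account, employee, role
           order by operationtimestamp) as prev_op,
    lead(operationtimestamp)
      over(partition by account, employee, role
           order by operationtimestamp) as next_ts
  from my_table
)
select account, employee, role,
       operationtimestamp as startingdate,
       next_ts            as endingdate      -- NULL while the employee is still on the account
from x
where operation = 'I'
union all
select account, employee, role,
       null               as startingdate,   -- no I was recorded for this range
       operationtimestamp as endingdate
from x
where operation = 'D'
  and (prev_op is null or prev_op = 'D')     -- orphaned D
order by employee, startingdate;
```

With the sample data from the question the second branch returns nothing, so the output matches the original query; it only adds rows when orphaned D transactions exist.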