T-SQL: How do I calculate the longest streak in SQL?

Tags: tsql, sql-server-2008

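I have

  TABLE EMPLOYEE - ID,DATE,IsPresent

I want to calculate the longest streak of presence for an employee. The IsPresent bit is false for the days he did not come in, so I want the largest number of consecutive days on which he was present. The DATE column is unique. So I tried this:

Select Id,Count(*) from Employee where IsPresent=1

But the above does not work. Can anyone tell me how to calculate the streak? I am sure people have run into this before. I tried searching online but could not quite follow what I found. Please help.

Try this: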

select 
    e.Id,
    e.date,
    (select 
       max(e1.date) 
     from 
       employee e1 
     where 
       e1.Id = e.Id and
       e1.date < e.date and 
       e1.IsPresent = 0) StreakStartDate,
    (select 
       min(e2.date) 
     from 
       employee e2 
     where 
       e2.Id = e.Id and
       e2.date > e.date and
       e2.IsPresent = 0) StreakEndDate           
from 
    employee e
where
    e.IsPresent = 1
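I am not entirely sure the syntax of this query is correct, because I do not have a database at hand right now. Also note that the StreakStartDate and StreakEndDate columns do not contain the first and last days the employee was present, but the nearest dates on which he was absent. If the dates in the table are roughly evenly spaced this does not matter; otherwise the query becomes slightly more complex, because we need to find the nearest presence dates instead. That refinement also makes it possible to handle the case where the longest streak is the first or the last streak (in the query as written, a streak with no absence before or after it gets a NULL bound).

The main idea is to find, for each date on which the employee was present, the start and end of the streak that date belongs to.

For each row in which the employee was present, the streak start is the latest date that is earlier than the current row's date and on which the employee was absent.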
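
One way to handle the first-or-last-streak case is sketched below. It assumes the Employee(Id, date, IsPresent) table from the question with exactly one row per employee per calendar day. When no absence bounds a streak, it falls back to the day before the employee's first row or the day after the last row. The aliases e0, e3 and s and the column name LongestStreak are illustrative, and the query has not been run against the asker's data.

```sql
-- Sketch only: assumes Employee(Id, [date], IsPresent) with one row per employee per day.
select
    s.Id,
    -- the bounds are absence dates, so the streak length is the gap minus one
    max(datediff(day, s.StreakStartDate, s.StreakEndDate) - 1) as LongestStreak
from (
    select
        e.Id,
        e.[date],
        -- latest absence before this day, else the day before the first recorded row
        isnull(
            (select max(e1.[date]) from Employee e1
             where e1.Id = e.Id and e1.[date] < e.[date] and e1.IsPresent = 0),
            dateadd(day, -1, (select min(e0.[date]) from Employee e0 where e0.Id = e.Id))
        ) as StreakStartDate,
        -- earliest absence after this day, else the day after the last recorded row
        isnull(
            (select min(e2.[date]) from Employee e2
             where e2.Id = e.Id and e2.[date] > e.[date] and e2.IsPresent = 0),
            dateadd(day, 1, (select max(e3.[date]) from Employee e3 where e3.Id = e.Id))
        ) as StreakEndDate
    from Employee e
    where e.IsPresent = 1
) s
group by s.Id;
```

Employees who were never present produce no row at all. If days can be missing from the table, the date difference no longer equals the number of present days, and a row-counting approach is safer.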

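I did this once to determine the consecutive days on which a firefighter had been on shift for at least 15 minutes.

Your case is a bit simpler.

If you are willing to assume that no employee was present more than 32 times in a row, you could use a common table expression. A better approach, though, is to use a temp table and a WHILE loop.

You will need a column called StartingRowID. Keep joining from the temp table to the employeeWorkDay table on the employee's next consecutive work day, and insert the result into the temp table. When @@ROWCOUNT = 0, you have captured the longest streak.

Now aggregate by StartingRowID to get the first day of the longest streak. I am running short on time, or I would include some sample code.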
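
Since that answer stops short of code, here is a minimal sketch of how the temp-table-and-loop idea might look. It is not the answerer's own code. It assumes the sample table T(EmployeeId, "DATE", IsPresent) used elsewhere on this page, and it uses the streak's start date in place of a StartingRowID. The names #Streak, StartDate, LastDate and @rows are illustrative.

```sql
-- Sketch only: T(EmployeeId, "DATE", IsPresent) is the sample table from this page.
create table #Streak (
    EmployeeId int  not null,
    StartDate  date not null,  -- first day of the streak (stands in for StartingRowID)
    LastDate   date not null,  -- a day that belongs to that streak
    primary key (EmployeeId, StartDate, LastDate)
);

-- Seed: a present day with no present day right before it starts a streak.
insert into #Streak (EmployeeId, StartDate, LastDate)
select t.EmployeeId, t."DATE", t."DATE"
from T t
where t.IsPresent = 1
  and not exists (select 1 from T p
                  where p.EmployeeId = t.EmployeeId
                    and p."DATE" = dateadd(day, -1, t."DATE")
                    and p.IsPresent = 1);

declare @rows int;
set @rows = 1;

-- Extend every streak by one day per pass until a pass adds nothing.
while @rows > 0
begin
    insert into #Streak (EmployeeId, StartDate, LastDate)
    select s.EmployeeId, s.StartDate, t."DATE"
    from #Streak s
    inner join T t
        on t.EmployeeId = s.EmployeeId
       and t."DATE" = dateadd(day, 1, s.LastDate)
       and t.IsPresent = 1
    where not exists (select 1 from #Streak x
                      where x.EmployeeId = s.EmployeeId
                        and x.StartDate = s.StartDate
                        and x.LastDate = t."DATE");

    set @rows = @@rowcount;
end;

-- Each row is one day of a streak, so the row count per start date is its length.
select l.EmployeeId, max(l.StreakLength) as LongestStreak
from (select EmployeeId, StartDate, count(*) as StreakLength
      from #Streak
      group by EmployeeId, StartDate) l
group by l.EmployeeId;

drop table #Streak;
```

The loop runs once per day of the longest streak plus one final empty pass, so it avoids the recursion limit that a recursive common table expression would hit on long streaks.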

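Then find the longest streak for each employee, using the query above that returns StreakStartDate and StreakEndDate as a subquery. Both bounds are absence dates, hence the - 1:

select id, max(datediff(day, streakStartDate, streakEndDate) - 1) as LongestStreak
from (<use subquery above>) s
group by id

Edit: here is the SQL Server version of the query: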

with LowerBound as (select second_day.EmployeeId
        , second_day."DATE" as LowerDate
        , row_number() over (partition by second_day.EmployeeId 
            order by second_day."DATE") as RN
    from T second_day
    left outer join T first_day
        on first_day.EmployeeId = second_day.EmployeeId
        and first_day."DATE" = dateadd(day, -1, second_day."DATE")
        and first_day.IsPresent = 1
    where first_day.EmployeeId is null
    and second_day.IsPresent = 1)
, UpperBound as (select first_day.EmployeeId
        , first_day."DATE" as UpperDate
        , row_number() over (partition by first_day.EmployeeId 
            order by first_day."DATE") as RN
    from T first_day
    left outer join T second_day
        on first_day.EmployeeId = second_day.EmployeeId
        and first_day."DATE" = dateadd(day, -1, second_day."DATE")
        and second_day.IsPresent = 1
    where second_day.EmployeeId is null
    and first_day.IsPresent = 1)
select LB.EmployeeID, max(datediff(day, LowerDate, UpperDate) + 1) as LongestStreak
from LowerBound LB
inner join UpperBound UB
    on LB.EmployeeId = UB.EmployeeId
    and LB.RN = UB.RN
group by LB.EmployeeId
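The query was originally written in Oracle; for SQL Server, substitute the appropriate date arithmetic (dateadd and datediff, as in the version above).

Assumptions:

- DATE is either a date value or a DateTime with a time component of 00:00:00.
- The primary key is (EmployeeId, DATE).
- All fields are NOT NULL.
- If a date is missing for an employee, they were not present on that date. This is used to handle the beginning and end of the data series, but it also means that missing dates in the middle will break streaks. That could be a problem, depending on requirements.

SQL Server version of the test data: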

create table T (EmployeeId int
    , "DATE" date not null
    , IsPresent bit not null 
    , constraint T_PK primary key (EmployeeId, "DATE")
)


insert into T values (1, '2000-01-01', 1);
insert into T values (2, '2000-01-01', 0);
insert into T values (3, '2000-01-01', 0);
insert into T values (3, '2000-01-02', 1);
insert into T values (3, '2000-01-03', 1);
insert into T values (3, '2000-01-04', 0);
insert into T values (3, '2000-01-05', 1);
insert into T values (3, '2000-01-06', 1);
insert into T values (3, '2000-01-07', 0);
insert into T values (4, '2000-01-01', 0);
insert into T values (4, '2000-01-02', 1);
insert into T values (4, '2000-01-03', 1);
insert into T values (4, '2000-01-04', 1);
insert into T values (4, '2000-01-05', 1);
insert into T values (4, '2000-01-06', 1);
insert into T values (4, '2000-01-07', 0);
insert into T values (5, '2000-01-01', 0);
insert into T values (5, '2000-01-02', 1);
insert into T values (5, '2000-01-03', 0);
insert into T values (5, '2000-01-04', 1);
insert into T values (5, '2000-01-05', 1);
insert into T values (5, '2000-01-06', 1);
insert into T values (5, '2000-01-07', 0);
The original Oracle version of the query:

with LowerBound as (select second_day.EmployeeId
        , second_day."DATE" as LowerDate
        , row_number() over (partition by second_day.EmployeeId 
            order by second_day."DATE") as RN
    from T second_day
    left outer join T first_day
        on first_day.EmployeeId = second_day.EmployeeId
        and first_day."DATE" = second_day."DATE" - 1
        and first_day.IsPresent = 1
    where first_day.EmployeeId is null
    and second_day.IsPresent = 1)
, UpperBound as (select first_day.EmployeeId
        , first_day."DATE" as UpperDate
        , row_number() over (partition by first_day.EmployeeId 
            order by first_day."DATE") as RN
    from T first_day
    left outer join T second_day
        on first_day.EmployeeId = second_day.EmployeeId
        and first_day."DATE" = second_day."DATE" - 1
        and second_day.IsPresent = 1
    where second_day.EmployeeId is null
    and first_day.IsPresent = 1)
select LB.EmployeeID, max(UpperDate - LowerDate + 1) as LongestStreak
from LowerBound LB
inner join UpperBound UB
    on LB.EmployeeId = UB.EmployeeId
    and LB.RN = UB.RN
group by LB.EmployeeId
Test data for the Oracle version:

    create table T (EmployeeId number(38) 
        , "DATE" date not null check ("DATE" = trunc("DATE"))
        , IsPresent number not null check (IsPresent in (0, 1))
        , constraint T_PK primary key (EmployeeId, "DATE")
    )
    /

    insert into T values (1, to_date('2000-01-01', 'YYYY-MM-DD'), 1);
    insert into T values (2, to_date('2000-01-01', 'YYYY-MM-DD'), 0);
    insert into T values (3, to_date('2000-01-01', 'YYYY-MM-DD'), 0);
    insert into T values (3, to_date('2000-01-02', 'YYYY-MM-DD'), 1);
    insert into T values (3, to_date('2000-01-03', 'YYYY-MM-DD'), 1);
    insert into T values (3, to_date('2000-01-04', 'YYYY-MM-DD'), 0);
    insert into T values (3, to_date('2000-01-05', 'YYYY-MM-DD'), 1);
    insert into T values (3, to_date('2000-01-06', 'YYYY-MM-DD'), 1);
    insert into T values (3, to_date('2000-01-07', 'YYYY-MM-DD'), 0);
    insert into T values (4, to_date('2000-01-01', 'YYYY-MM-DD'), 0);
    insert into T values (4, to_date('2000-01-02', 'YYYY-MM-DD'), 1);
    insert into T values (4, to_date('2000-01-03', 'YYYY-MM-DD'), 1);
    insert into T values (4, to_date('2000-01-04', 'YYYY-MM-DD'), 1);
    insert into T values (4, to_date('2000-01-05', 'YYYY-MM-DD'), 1);
    insert into T values (4, to_date('2000-01-06', 'YYYY-MM-DD'), 1);
    insert into T values (4, to_date('2000-01-07', 'YYYY-MM-DD'), 0);
    insert into T values (5, to_date('2000-01-01', 'YYYY-MM-DD'), 0);
    insert into T values (5, to_date('2000-01-02', 'YYYY-MM-DD'), 1);
    insert into T values (5, to_date('2000-01-03', 'YYYY-MM-DD'), 0);
    insert into T values (5, to_date('2000-01-04', 'YYYY-MM-DD'), 1);
    insert into T values (5, to_date('2000-01-05', 'YYYY-MM-DD'), 1);
    insert into T values (5, to_date('2000-01-06', 'YYYY-MM-DD'), 1);
    insert into T values (5, to_date('2000-01-07', 'YYYY-MM-DD'), 0);

The GROUP BY is missing.

To select the total man-days of attendance for the whole office:

Select Count(*) from Employee where IsPresent=1

To select the man-days of attendance per employee:

Select Id,Count(*)
from Employee
where IsPresent=1
group by id;

But that is still not good, because it counts the total days of attendance, not the length of continuous attendance.

What you need to do is construct a temp table with a second date column, date2, set to today. The table lists all the days on which an employee was absent.

-- pseudo-SQL: in SQL Server, use SELECT ... INTO instead of CREATE ... AS, and getdate() for today
create tmpdb.absentdates as
Select id, date, today as date2
from EMPLOYEE
where IsPresent=0
order by id, date;

So the trick is to calculate the date difference between two absent days to find the length of the run of present days between them. Now fill in date2 with the next absent date for each employee. The most recent record per employee will not be updated and keeps the value today, because the database holds no record with a date later than today.

The table can also be built in two steps:

create tmpdb.absentdatesX as
Select id, date
from EMPLOYEE
where IsPresent=0
order by id, date;

create tmpdb.absentdates as
select *, today as date2
from tmpdb.absentdatesX;

You need to insert the hire date, presuming that the earliest date per employee in the database is the hire date:

insert into tmpdb.absentdates
select e.id, min(e.date), today
from EMPLOYEE e
group by e.id;

Now update date2 with the next later absent date, so that date2 - date can be computed:

select id, datediff(day, date, date2) as continuousPresence
from tmpdb.absentdates
order by id, continuousPresence;

But you only want to keep the longest streak:

select id, max(datediff(day, date, date2)) as continuousPresence
from tmpdb.absentdates
group by id
order by id;

However, the above is still problematic, because datediff does not take holidays and weekends into account.

So we rely on the count of records as the number of legitimate working days:

create tmpdb.absentCount as
Select a.id, a.date, a.date2, count(*) as continuousPresence
from EMPLOYEE e, tmpdb.absentdates a
where e.id = a.id
  and e.date >= a.date
  and e.date < a.date2
  and e.IsPresent = 1
group by a.id, a.date, a.date2
order by a.id, a.date;

select id, max(continuousPresence)
from tmpdb.absentCount
group by id

To list the dates of the streak:

select c.id, c.date, c.date2, c.continuousPresence
from tmpdb.absentCount c
where c.continuousPresence =
    (select max(m.continuousPresence) from tmpdb.absentCount m where m.id = c.id);

The SQL Server T-SQL above may contain some errors, but that is the general idea.

For some reason I could not create the temp table (it reported an unknown-object error), so I used a SELECT INTO statement instead.

Here is an alternative version that handles missing days differently. Say you only store rows for work days, and being at work Monday to Friday one week and Monday to Friday the next week counts as ten consecutive days. This query assumes that dates missing in the middle of a series of rows are non-work days.

with NumberedRows as (select EmployeeId
        , "DATE"
        , IsPresent
        , row_number() over (partition by EmployeeId
            order by "DATE") as RN
--        , min("DATE") over (partition by EmployeeId, IsPresent) as MinDate
--        , max("DATE") over (partition by EmployeeId, IsPresent) as MaxDate
    from T)
, LowerBound as (select SecondRow.EmployeeId
        , SecondRow.RN
        , row_number() over (partition by SecondRow.EmployeeId 
            order by SecondRow.RN) as LowerBoundRN
    from NumberedRows SecondRow
    left outer join NumberedRows FirstRow
        on FirstRow.IsPresent = 1
        and FirstRow.EmployeeId = SecondRow.EmployeeId
        and FirstRow.RN + 1 = SecondRow.RN
    where FirstRow.EmployeeId is null
    and SecondRow.IsPresent = 1)
, UpperBound as (select FirstRow.EmployeeId
       , FirstRow.RN
       , row_number() over (partition by FirstRow.EmployeeId
            order by FirstRow.RN) as UpperBoundRN
    from NumberedRows FirstRow
    left outer join NumberedRows SecondRow
        on SecondRow.IsPresent = 1
        and FirstRow.EmployeeId = SecondRow.EmployeeId
        and FirstRow.RN + 1 = SecondRow.RN
    where SecondRow.EmployeeId is null
    and FirstRow.IsPresent = 1)
select LB.EmployeeId, max(UB.RN - LB.RN + 1) as LongestStreak
from LowerBound LB 
inner join UpperBound UB
    on LB.EmployeeId = UB.EmployeeId
    and LB.LowerBoundRN = UB.UpperBoundRN
group by LB.EmployeeId
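
The bound-matching queries above can also be written more compactly. For consecutive present days, subtracting a running row number from the date always gives the same value, so that value identifies the streak. The sketch below is not taken from any of the answers. It assumes the sample table T(EmployeeId, "DATE", IsPresent) and, like the first SQL Server query, treats a missing date as a break in the streak. The names Present, Streaks, StreakAnchor and StreakLength are illustrative.

```sql
-- Sketch only: uses the sample table T(EmployeeId, "DATE", IsPresent) from this page.
with Present as (
    select EmployeeId
        , "DATE"
        -- consecutive dates minus a consecutive row number all land on the same anchor date
        , dateadd(day
            , -cast(row_number() over (partition by EmployeeId order by "DATE") as int)
            , "DATE") as StreakAnchor
    from T
    where IsPresent = 1)
, Streaks as (
    select EmployeeId, StreakAnchor, count(*) as StreakLength
    from Present
    group by EmployeeId, StreakAnchor)
select EmployeeId, max(StreakLength) as LongestStreak
from Streaks
group by EmployeeId;
```

With the SQL Server test data above, this should return 1 for employee 1, 2 for employee 3, 5 for employee 4 and 3 for employee 5. Employee 2 was never present and produces no row.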