SQL: get records only when each user has 30 or more consecutive days

Tags: sql, sql-server, sql-server-2005


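I have the following data returned from a query. I am putting it in a temp table, so it is now in a temp table that I can query (obviously there is a lot more data in real life; I am just showing an example):


I need to return only the EmpIds that have 30 or more consecutive days in the date column. I also need to return the day count for those employees. An employee may have two or more separate runs of 30 or more consecutive days; in that case I would like to return multiple rows. So if an employee has dates from 2011-01-01 to 2011-02-20, return that range and its count in one row. If the same employee also has dates from 2011-05-01 to 2011-07-01, return that range in another row. Essentially, every break in the consecutive days starts a separate record.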

Something like this should do the trick, though I haven't tested it:

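-- Counts pairs of adjacent days per employee.
-- A run of n consecutive days yields n - 1 pairs, so 30 days means 29 pairs.
-- Caveat: pairs from separate runs of the same employee are added together,
-- so this does not return one row per run.
SELECT
    a.empid
  , COUNT(*) AS consecutive_count
  , MIN(a.mydate) AS startdate
FROM logins a
INNER JOIN logins b
  ON a.empid = b.empid
 AND DATEDIFF(day, a.mydate, b.mydate) = 1
GROUP BY a.empid
HAVING COUNT(*) >= 29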

Using DENSE_RANK() should do the trick:

;WITH sampledata
    AS (SELECT 1 AS id, DATEADD(day, -0, GETDATE())AS somedate
        UNION ALL SELECT 1, DATEADD(day, -1, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -2, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -3, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -4, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -5, GETDATE())
        UNION ALL SELECT 1, DATEADD(day, -10, GETDATE())
        UNION ALL SELECT 1, '2011-01-01 00:00:00'
        UNION ALL SELECT 1, '2010-12-31 00:00:00'
        UNION ALL SELECT 1, '2011-02-01 00:00:00'
        UNION ALL SELECT 1, DATEADD(day, -10, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, 0, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, -1, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, -2, GETDATE())
        UNION ALL SELECT 2, DATEADD(day, -6, GETDATE())
        UNION ALL SELECT 3, DATEADD(day, 0, GETDATE())
        UNION ALL SELECT 4, DATEADD(day, 0, GETDATE())
        UNION ALL SELECT 5, DATEADD(day, 0, GETDATE()))
   , ranking
    AS (SELECT *, DENSE_RANK()OVER(PARTITION BY id ORDER BY DATEDIFF(day, 0, somedate)) - DATEDIFF(day, 0, somedate)AS dategroup
          FROM sampledata)
    SELECT id
         , MIN(somedate)AS range_start
         , MAX(somedate)AS range_end
         , DATEDIFF(day, MIN(somedate), MAX(somedate)) + 1 AS consecutive_days
      FROM ranking
     GROUP BY id, dategroup
     --HAVING DATEDIFF(day, MIN(somedate), MAX(somedate)) + 1 >= 30 --change as needed
     ORDER BY id, range_start
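
The grouping idea behind the query above can be tried outside SQL Server. The following is a minimal Python sketch, not the answer's code: the function name `dense_rank_ranges` and the sample dates are made up for illustration, and `date.toordinal()` stands in for `DATEDIFF(day, 0, somedate)` (the two differ only by a constant offset, which does not affect the grouping).

```python
from collections import defaultdict
from datetime import date

def dense_rank_ranges(dates):
    """Return (range_start, range_end, consecutive_days) per run of dates.

    Duplicate dates are allowed: equal day numbers share one dense rank.
    """
    day_numbers = [d.toordinal() for d in dates]
    # Dense rank: ranks follow the distinct day numbers with no gaps.
    rank_of = {n: r for r, n in enumerate(sorted(set(day_numbers)), start=1)}
    groups = defaultdict(list)
    for d, n in zip(dates, day_numbers):
        # rank - day number is constant within a run of consecutive days.
        groups[rank_of[n] - n].append(d)
    return sorted((min(g), max(g), (max(g) - min(g)).days + 1)
                  for g in groups.values())

sample = [date(2011, 1, 1), date(2011, 1, 2), date(2011, 1, 2),
          date(2011, 1, 3), date(2011, 1, 10)]
for start, end, days in dense_rank_ranges(sample):
    print(start, end, days)
```

With plain row numbers instead of dense ranks, the duplicate 2011-01-02 would shift the difference by one and split the first run in two.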

This is a good use case for a recursive CTE. I borrowed the sample data from @Davin:

with data AS --sample data 
( SELECT 1 as id ,DATEADD(DD,-0,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-1,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-2,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-3,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-4,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-5,GETDATE()) as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-10,GETDATE()) as date UNION ALL 
SELECT 1 as id ,'2011-01-01 00:00:00.000' as date UNION ALL 
SELECT 1 as id ,'2010-12-31 00:00:00.000' as date UNION ALL 
SELECT 1 as id ,'2011-02-01 00:00:00.000' as date UNION ALL 
SELECT 1 as id ,DATEADD(DD,-10,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,-1,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,-2,GETDATE()) as date UNION ALL 
SELECT 2 as id ,DATEADD(DD,-6,GETDATE()) as date UNION ALL 
SELECT 3 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL 
SELECT 4 as id ,DATEADD(DD,0,GETDATE()) as date UNION ALL 
SELECT 5 as id ,DATEADD(DD,0,GETDATE()) as date   ) 
,CTE AS
(
    SELECT id, CAST(date as date) Date, Consec = 1
    FROM data
    UNION ALL
    SELECT t.id, CAST(t.date as DATE) Date, Consec = (c.Consec + 1)
    FROM data T
    INNER JOIN CTE c
        ON T.id = c.id
        AND CAST(t.date as date) = CAST(DATEADD(day, 1, c.date) as date)

)

SELECT id, MAX(consec)
FROM CTE
GROUP BY id
ORDER BY id

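Basically, this produces a lot of rows per person, and each row measures the number of consecutive days in the chain that ends at that row's date.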
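
To make the chain-extension idea concrete, here is a rough Python sketch under the same reading of the query; the function name `longest_streak` and the sample dates are hypothetical. Like the anchor part of the CTE, it starts a chain at every date and keeps extending it while the next calendar day is present.

```python
from datetime import date, timedelta

def longest_streak(dates):
    """Length of the longest run of consecutive calendar days in dates."""
    days = set(dates)  # duplicates collapse and do not change the result
    best = 0
    for start in days:
        # Mirror the recursive step: extend while the next day exists.
        length = 1
        while start + timedelta(days=length) in days:
            length += 1
        best = max(best, length)
    return best

sample = [date(2011, 1, 1), date(2011, 1, 2), date(2011, 1, 3),
          date(2011, 1, 5)]
print(longest_streak(sample))
```

As with `MAX(consec)` grouped by id, this yields a single number per person rather than one row per run.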

Assuming there are no duplicate dates for the same employee:

;WITH ranged AS (
  SELECT
    EmpId,
    Date,
    RangeId = DATEDIFF(DAY, 0, Date)
            - ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY Date)
  FROM atable
)
SELECT
  EmpId,
  StartDate = MIN(Date),
  EndDate   = MAX(Date),
  DayCount  = DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1
FROM ranged
GROUP BY EmpId, RangeId
HAVING DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 >= 30
ORDER BY EmpId, MIN(Date)
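
DATEDIFF turns the dates into integers (the difference in days between the 0 date (1900-01-01) and Date). If the dates are consecutive, the integers are consecutive too. Taking the data sample from the question, the DATEDIFF results would be: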

EmpId  Date        DATEDIFF
-----  ----------  --------
1      2011-01-01  40542
1      2011-01-02  40543
1      2011-01-03  40544
2      2011-02-03  40575
3      2011-03-01  40601
4      2011-03-02  40602
5      2011-01-02  40543
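
These day numbers can be reproduced outside SQL Server. A minimal Python sketch, assuming the 0 date is 1900-01-01 as stated above; the helper name `day_number` is made up for illustration:

```python
from datetime import date

BASE_DATE = date(1900, 1, 1)  # the "0" date, per the explanation above

def day_number(d: date) -> int:
    """Whole days between 1900-01-01 and d, like DATEDIFF(DAY, 0, d)."""
    return (d - BASE_DATE).days

for d in (date(2011, 1, 1), date(2011, 1, 2), date(2011, 2, 3)):
    print(d.isoformat(), day_number(d))
```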
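
Now, if you take each employee's rows, assign row numbers to them in date order, and subtract the row number from the numeric representation, you will find that the difference stays the same across consecutive numbers (and therefore consecutive dates). Using a slightly different sample for better illustration, it looks like this: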

Date        DATEDIFF  RowNum  RangeId
----------  --------  ------  -------
2011-01-01  40542     1       40541
2011-01-02  40543     2       40541
2011-01-03  40544     3       40541
2011-01-05  40546     4       40542
2011-01-07  40548     5       40543
2011-01-08  40549     6       40543
2011-01-09  40550     7       40543
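
The specific value of RangeId does not matter; what matters is that it stays the same across consecutive dates. Based on that fact, you can use it as a grouping criterion to count the dates in each group and to get the range boundaries.

The query above uses DATEDIFF(DAY, MIN(Date), MAX(Date)) + 1 to count the days, but you could also simply use COUNT(*).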
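
The RangeId grouping can be sketched in a few lines of Python. This is an illustration under the same assumption as the query (no duplicate dates per employee); the function name `consecutive_ranges` and the `min_days` parameter are hypothetical and play the role of the HAVING filter.

```python
from datetime import date
from itertools import groupby

BASE_DATE = date(1900, 1, 1)  # the "0" date used by DATEDIFF(DAY, 0, Date)

def consecutive_ranges(dates, min_days=1):
    """Return (start, end, day_count) for each run of consecutive dates.

    Assumes the dates are distinct, as the query above does.
    """
    ordered = sorted(dates)
    # Day number minus row number is constant within a run of consecutive days.
    keyed = [((d - BASE_DATE).days - row_num, d)
             for row_num, d in enumerate(ordered, start=1)]
    ranges = []
    for _, group in groupby(keyed, key=lambda pair: pair[0]):
        run = [d for _, d in group]
        day_count = (run[-1] - run[0]).days + 1
        if day_count >= min_days:
            ranges.append((run[0], run[-1], day_count))
    return ranges

sample = [date(2011, 1, 1), date(2011, 1, 2), date(2011, 1, 3),
          date(2011, 1, 5),
          date(2011, 1, 7), date(2011, 1, 8), date(2011, 1, 9)]
for start, end, days in consecutive_ranges(sample):
    print(start, end, days)
```

Calling `consecutive_ranges(sample, min_days=30)` on real data would keep only the runs of 30 or more days, one entry per run.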

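Check this question:

Works for ranges within the same month, but breaks at month transitions.

@Andriy M - changed it so that it also works across months and years.

It turns out we had the same idea. ROW_NUMBER() might be faster than DENSE_RANK(), but using DENSE_RANK() should account for duplicate dates. (Only I would change ORDER BY Date to ORDER BY DATEDIFF(day, 0, Date) in the OVER clause.)

@Andriy M - agreed, I think accounting for duplicates is more useful.

Do you know how to copy the code from here and paste it into SSMS without it all ending up on a single line?

I don't think this will return multiple rows if an id has more than one range of 30 consecutive days...

I know this is a very old post, but could you explain how RangeId works? I don't understand how you arrived at a value that you can group by using Row_Number().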