How to group sequential, timestamped rows in SQL and return the date range of each group

I have an MS SQL 2008 database table that looks like this:

Registration | Date | DriverID | TrailerID

Here is a sample of the data:

AB53EDH,2013/07/03 10:00,54,23
AB53EDH,2013/07/03 10:01,54,23
...
AB53EDH,2013/07/03 10:45,54,23
AB53EDH,2013/07/03 10:46,54,NULL <-- Trailer changed
AB53EDH,2013/07/03 10:47,54,NULL
...
AB53EDH,2013/07/03 11:05,54,NULL
AB53EDH,2013/07/03 11:06,54,102  <-- Trailer changed
AB53EDH,2013/07/03 11:07,54,102
...
AB53EDH,2013/07/03 12:32,54,102
AB53EDH,2013/07/03 12:33,72,102  <-- Driver changed
AB53EDH,2013/07/03 12:34,72,102
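How would you achieve this in SQL?

Update: Thanks for the answers so far. Unfortunately, they stopped working when I applied them to my production data: the queries submitted so far do not work correctly when run against only part of the data.

Below are some sample queries that create the table and populate it with the dummy data above. There is more data here than in the example above: the driver/trailer combinations 54, 23 and 54, NULL are repeated, to make sure a query recognizes them as two distinct groups. I have also copied the same data three times over different date ranges, to test whether a query still works when run against part of the data set: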

CREATE TABLE [dbo].[TempTable](
    [Registration] [nvarchar](50) NOT NULL,
    [Date] [datetime] NOT NULL,
    [DriverID] [int] NULL,
    [TrailerID] [int] NULL
)

INSERT INTO dbo.TempTable
VALUES 
('AB53EDH','2013/07/03 10:00', 54,23),
('AB53EDH','2013/07/03 10:01', 54,23),
('AB53EDH','2013/07/03 10:45', 54,23),
('AB53EDH','2013/07/03 10:46', 54,NULL),
('AB53EDH','2013/07/03 10:47', 54,NULL),
('AB53EDH','2013/07/03 11:05', 54,NULL),
('AB53EDH','2013/07/03 11:06', 54,102),
('AB53EDH','2013/07/03 11:07', 54,102),
('AB53EDH','2013/07/03 12:32', 54,102),
('AB53EDH','2013/07/03 12:33', 72,102),
('AB53EDH','2013/07/03 12:34', 72,102),
('AB53EDH','2013/07/03 13:00', 54,102),
('AB53EDH','2013/07/03 13:01', 54,102),
('AB53EDH','2013/07/03 13:02', 54,102),
('AB53EDH','2013/07/03 13:03', 54,102),
('AB53EDH','2013/07/03 13:04', 54,23),
('AB53EDH','2013/07/03 13:05', 54,23),
('AB53EDH','2013/07/03 13:06', 54,23),
('AB53EDH','2013/07/03 13:07', 54,NULL),
('AB53EDH','2013/07/03 13:08', 54,NULL),
('AB53EDH','2013/07/03 13:09', 54,NULL),
('AB53EDH','2013/07/03 13:10', 54,NULL),
('AB53EDH','2013/07/03 13:11', NULL,NULL)

INSERT INTO dbo.TempTable
SELECT Registration, DATEADD(M, -1, Date), DriverID, TrailerID
FROM dbo.TempTable
WHERE Date > '2013/07/01'

INSERT INTO dbo.TempTable
SELECT Registration, DATEADD(M, 1, Date), DriverID, TrailerID
FROM dbo.TempTable
WHERE Date > '2013/07/01'
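
Before trying any grouping query, it may be worth confirming that the three copies of the data landed as intended. The check below is only a sketch; the column aliases Yr, Mth and RowsInMonth are made up for illustration. With the 23 rows inserted above, the first copy shifts every row back one month and the second copy shifts only the July rows forward, so the result should be three months (June, July and August 2013) of 23 rows each.

```sql
-- Sanity check (illustrative): count the rows per calendar month
-- in dbo.TempTable after the three INSERT statements have run.
SELECT
    YEAR([Date])  AS Yr,           -- alias chosen for this sketch
    MONTH([Date]) AS Mth,          -- alias chosen for this sketch
    COUNT(*)      AS RowsInMonth   -- alias chosen for this sketch
FROM dbo.TempTable
GROUP BY YEAR([Date]), MONTH([Date])
ORDER BY Yr, Mth;
```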

Here is an approach using correlated subqueries:

with tt as (
       select tt.*,
              (select top 1 date
               from TempTable tt2
               where tt2.Registration = tt.Registration and
                     tt2.DriverID = tt.DriverID and
                     (tt2.TrailerID = tt.TrailerID or tt2.TrailerID is null and tt.TrailerID is null) and
                     tt2.Date < tt.Date
               order by date desc
              ) prevDate
       from TempTable tt
      )
select registration, min(date) as startdate, max(date) as enddate, driverid, trailerid
from (select tt.*,
             (select top 1 date
              from tt tt3
              where prevDate is NULL and
                    tt3.Date <= tt.date
              order by Date desc
             ) as grp
      from TempTable tt
     ) tt
group by grp, Registration, DriverID, trailerid;

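This query uses CTEs to do the following:

1. Create an ordered set of records, partitioned by Registration.
2. For each record, capture the data of the previous record.
3. Compare the current and previous data to determine whether the current record is a new instance of a driver/trailer assignment.
4. Keep only the new records.
5. For each new record, get the last date logged before the next new driver/trailer assignment occurs.

Here is the code: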

;WITH c AS (
-- Group records by Registration, assign row numbers in order of date
SELECT
  ROW_NUMBER() OVER (
    PARTITION BY Registration 
    ORDER BY Registration, [Date]) 
  AS Rn,
  Registration,
  [Date],
  DriverID,
  TrailerID
FROM 
  TempTable
)
,c2 AS (
-- Self join to table to get Driver and Trailer from previous record
SELECT 
  t1.Rn,
  t1.Registration,
  t1.[Date],
  t1.DriverID,
  t1.TrailerID,
  t2.DriverID AS PrevDriverID,
  t2.TrailerID AS PrevTrailerID
FROM 
  c t1
LEFT OUTER JOIN 
  c t2
ON 
  t1.Registration = t2.Registration
AND 
  t2.Rn = t1.Rn - 1 
)
,c3 AS (
-- Use INTERSECT to determine if this record is new in sequence
SELECT
  Rn,
  Registration,
  [Date],
  DriverID,
  TrailerID,
  CASE WHEN NOT EXISTS (
            SELECT DriverID, TrailerID 
            INTERSECT 
            SELECT PrevDriverID, PrevTrailerID) 
       THEN 1
       ELSE 0
  END AS IsNew
FROM c2 
) 
-- For all new records in sequence, 
-- get the last date logged before a new record appeared
SELECT 
  Registration,
  [Date] AS StartDate,
  COALESCE (
    (
       SELECT TOP 1 [Date]
       FROM c3 
       WHERE Registration = t.Registration
       AND Rn < (
         SELECT TOP 1 Rn
         FROM c3 
         WHERE Registration = t.Registration 
         AND Rn > t.Rn 
         AND IsNew = 1 
         ORDER BY Rn )
       ORDER BY Rn DESC 
    )
    , [Date]) AS EndDate,
  DriverID,
  TrailerID
FROM 
  c3 t
WHERE
  IsNew = 1 
ORDER BY 
  Registration,
  StartDate

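
I think there is an error in the expected result data: AB53EDH, 2013/07/03 10:06, 2013/07/03 12:32, 54, 102 should be AB53EDH, 2013/07/03 11:06, 2013/07/03 12:32, 54, 102.

+1. Your question includes working code, which is an encouragement to write an answer.

@armen: Thanks, corrected.

@Amr, how should a registration with a NULL driver and a NULL trailer appear in the summary?

@8kb: That is a valid combination and should be represented in the results like any other.

Unfortunately, this approach still has a problem. If the same driver and trailer combination appears again later, the query treats both occurrences as one group instead of recognizing two distinct groups separated by a period of time. I have updated the original question with more data points to help illustrate this. You will notice that the driver/trailer combinations 54, 23 and 54, NULL each occur twice, but the results of your query do not reflect that.

Oh, now that is a completely different story. I once asked a question very similar to yours, and it still needs an answer. I would be glad to hear of any progress on this one!

That's a shame! The only other approach I can think of is to step through the rows in order with a cursor, but I have found that this can be very slow on large amounts of data, and I was hoping that a set-based way of doing the same task would perform better. We will see how this develops.

First, thank you for taking the time to answer my question. While studying this query and trying to apply it to my production data, I realized that unfortunately it does not work in all cases. The query seems to rely on each driver/trailer combination being preceded by another combination whose prevDate is NULL, that is, one that is occurring in the data for the first time. Unfortunately, that is not the case in the production data, because any combination can occur several times within the same date range.

@AmrBekhit. I don't understand your comment. This query captures multiple sequences of rows with the same Registration, DriverID and TrailerID, based on the date. The same triple can occur multiple times at different dates, and it will accordingly appear multiple times in the data. The calculation of prevDate determines where each sequence starts.

I modified your query to work on a subset of the data in the table. Try recreating TempTable using the updated code in the question, then run the query. You will notice that the grouping is incorrect for the groups on 2013/08/03. In fact, even if you just recreate the temp table and run the original query, you will still see the problem.
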
A variant of the correlated-subquery query above, in which prevDate is non-NULL only when the row at the immediately preceding date has the same Registration, DriverID and TrailerID:

with tt as (
       select tt.*, tt3.date as PrevDate
       from (select tt.*,
                    (select top 1 date
                     from TempTable tt2
                     where tt2.date < tt.date
                     order by date desc
                   ) prevDate1
             from TempTable tt
            ) tt left outer join
            TempTable tt3
            on tt.prevdate1 = tt3.date and
               tt3.Registration = tt.Registration and
               tt3.DriverID = tt.DriverID and
               (tt3.TrailerID = tt.TrailerID or tt3.TrailerID is null and tt.TrailerID is null)
     )
select registration, count(*), min(date) as startdate, max(date) as enddate, driverid, trailerid
from (select tt.*,
             (select top 1 date
              from tt tt3
              where prevDate is NULL and
                    tt3.Date <= tt.date
              order by Date desc
             ) as grp
      from TempTable tt
     ) tt
group by grp, Registration, DriverID, trailerid;
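
The comments in this thread ask for a set-based query that keeps a repeated driver/trailer combination as separate groups. One common way to do that is the "difference of row numbers" technique for gaps-and-islands problems. The query below is a sketch of it, not one of the answers posted here. The CTE name numbered and the columns rn_all and rn_pair are invented for illustration, and it assumes that [Date] is unique within each Registration (ties would make the ordering ambiguous). ROW_NUMBER() is available in SQL Server 2008.

```sql
-- Sketch: one output row per consecutive run of the same
-- (DriverID, TrailerID) pair within a Registration.
-- Assumption: [Date] is unique per Registration.
WITH numbered AS (
    SELECT
        Registration,
        [Date],
        DriverID,
        TrailerID,
        -- Position of the row among all rows of its registration
        ROW_NUMBER() OVER (
            PARTITION BY Registration
            ORDER BY [Date]) AS rn_all,
        -- Position of the row among rows with the same driver/trailer pair.
        -- PARTITION BY places NULLs together, so (54, NULL) is its own pair.
        ROW_NUMBER() OVER (
            PARTITION BY Registration, DriverID, TrailerID
            ORDER BY [Date]) AS rn_pair
    FROM dbo.TempTable
)
SELECT
    Registration,
    MIN([Date]) AS StartDate,
    MAX([Date]) AS EndDate,
    DriverID,
    TrailerID
FROM numbered
-- rn_all - rn_pair stays constant inside a run and increases
-- whenever rows of a different pair come in between.
GROUP BY Registration, DriverID, TrailerID, rn_all - rn_pair
ORDER BY Registration, StartDate;
```

Within a run both row numbers advance by one per row, so their difference does not change. When the same pair shows up again after other rows, rn_all has moved ahead by more than rn_pair, which puts the later run in a different group. On the sample data in the question, this should return the two occurrences of 54, 23 and of 54, NULL in each month as separate rows.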