SQL LAG function and SUM

Tags: sql, sql-server, datetime, gaps-and-islands, date-arithmetic

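I need to get a list of users who were offline for at least 20 minutes every day. Here is my data:

I have this starting query, but I am stuck on how to sum the differences in offline minutes. In other words, I need a condition like "offline minutes >= 20" in the WHERE clause.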

SELECT
    userid,
    connected,
    LAG(recordeddt) OVER (PARTITION BY userid ORDER BY userid, recordeddt) AS offline_period,
    DATEDIFF(minute, LAG(recordeddt) OVER (PARTITION BY userid ORDER BY userid, recordeddt), recordeddt) AS offline_mins
FROM device_data
WHERE connected = 0;
My expected result:

Thanks in advance.

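This sounds like a gaps-and-islands problem, where you want to group together adjacent rows that have the same userid and status.

First, here is a query that computes the islands: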

select userid, connected, min(recordeddt) startdt, max(lead_recordeddt) enddt,
    datediff(minute, min(recordeddt), max(lead_recordeddt)) duration  -- island length in minutes
from (
    select dd.*,
        row_number()     over(partition by userid order by recordeddt) rn1,
        row_number()     over(partition by userid, connected order by recordeddt) rn2,
        lead(recordeddt) over(partition by userid order by recordeddt) lead_recordeddt
    from device_data dd
) dd
group by userid, connected, rn1 - rn2  -- rn1 - rn2 is constant within one island
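
To see why grouping by `rn1 - rn2` isolates each island, here is a minimal Python sketch that mirrors the two `row_number()` calls for a single user. The sample rows and the function name `island_keys` are made up for illustration and are not taken from the question's data.

```python
from collections import defaultdict

# Hypothetical rows for one user, already sorted by time:
# (recordeddt label, connected flag). Values are illustrative only.
rows = [("10:00", 1), ("10:05", 0), ("10:10", 0), ("10:15", 1), ("10:20", 0)]


def island_keys(rows):
    """Return (connected, rn1 - rn2) per row, mirroring the two ROW_NUMBER() calls."""
    seen_per_status = defaultdict(int)
    keys = []
    for rn1, (_, connected) in enumerate(rows, start=1):
        # rn1: position among all rows of the user
        # rn2: position among rows of the user with the same status
        seen_per_status[connected] += 1
        rn2 = seen_per_status[connected]
        keys.append((connected, rn1 - rn2))
    return keys


keys = island_keys(rows)
for (label, _), key in zip(rows, keys):
    print(label, key)
```

Consecutive rows with the same status advance `rn1` and `rn2` together, so their difference stays fixed. A status change in between shifts the difference, which is what gives each island its own grouping key.
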
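Now, suppose you want the users who were offline for at least 20 minutes every day. You can split the islands by day and filter with a HAVING clause: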

select userid
from (
    select recordedday, userid, connected,
        datediff(minute, min(recordeddt), max(lead_recordeddt)) duration  -- island length in minutes
    from (
        select dd.*, v.*,
            row_number()     over(partition by v.recordedday, userid order by recordeddt) rn1,
            row_number()     over(partition by v.recordedday, userid, connected order by recordeddt) rn2,
            lead(recordeddt) over(partition by v.recordedday, userid order by recordeddt) lead_recordeddt
        from device_data dd
        cross apply (values (convert(date, recordeddt))) v(recordedday)
    ) dd
    group by recordedday, userid, connected, rn1 - rn2
) dd
group by userid
-- keep users whose qualifying offline days cover every day they have data for
having count(distinct case when connected = 0 and duration >= 20 then recordedday end) = count(distinct recordedday)

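As already mentioned, this is a gaps-and-islands problem. Here is my approach: create the groups with a simple LAG function, filter out the connected rows, and then work with the date ranges.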

CREATE TABLE #tmp(ID int, UserID int, dt datetime, connected int)
INSERT INTO #tmp VALUES
(1,1,'11/2/20 10:00:00',1),
(2,1,'11/2/20 10:05:00',0),
(3,1,'11/2/20 10:10:00',0),
(4,1,'11/2/20 10:15:00',0),
(5,1,'11/2/20 10:20:00',0),
(6,2,'11/2/20 10:00:00',1),
(7,2,'11/2/20 10:05:00',1),
(8,2,'11/2/20 10:10:00',0),
(9,2,'11/2/20 10:15:00',0),
(10,2,'11/2/20 10:20:00',0),
(11,2,'11/2/20 10:25:00',0),
(12,2,'11/2/20 10:30:00',0)


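SELECT UserID, connected, DATEDIFF(minute, MIN(dt), MAX(dt)) AS OFFLINE_MINUTES
FROM
(
    -- running count of status changes: every island gets its own grp value
    SELECT *, SUM(CASE WHEN connected <> LG THEN 1 ELSE 0 END) OVER (ORDER BY UserID, dt) grp
    FROM
    (
        -- LG = status of the user's previous row (the current status on the first row)
        select *, LAG(connected, 1, connected) OVER (PARTITION BY UserID ORDER BY UserID, dt) LG
        from #tmp
    ) x
) y
WHERE connected <> 1  -- keep offline rows only
GROUP BY UserID, grp, connected
HAVING DATEDIFF(minute, MIN(dt), MAX(dt)) >= 20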
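
The LAG plus running-sum grouping above can be traced by hand with a short Python sketch over the same `#tmp` rows. This is only an illustration of the logic, not a replacement for the query. It assumes the literals `'11/2/20'` mean 2 November 2020 (US month/day order), and the names `t` and `offline_islands` are hypothetical.

```python
from datetime import datetime


def t(hhmm):
    # Assumption: '11/2/20' in the sample means 2020-11-02 (month/day/year).
    return datetime.strptime("2020-11-02 " + hhmm, "%Y-%m-%d %H:%M")


# Same rows as the #tmp sample: (UserID, dt, connected)
rows = [
    (1, t("10:00"), 1), (1, t("10:05"), 0), (1, t("10:10"), 0),
    (1, t("10:15"), 0), (1, t("10:20"), 0),
    (2, t("10:00"), 1), (2, t("10:05"), 1), (2, t("10:10"), 0),
    (2, t("10:15"), 0), (2, t("10:20"), 0), (2, t("10:25"), 0),
    (2, t("10:30"), 0),
]


def offline_islands(rows, min_minutes=20):
    """Return (UserID, offline minutes) for offline islands of at least min_minutes."""
    grp = 0          # running count of status changes (the SUM ... OVER column)
    previous = {}    # last seen status per user (the LAG column)
    spans = {}       # (UserID, grp) -> [first dt, last dt] of an offline island
    for user, dt, connected in sorted(rows):  # ORDER BY UserID, dt
        lag = previous.get(user, connected)   # LAG(connected, 1, connected)
        if connected != lag:
            grp += 1
        previous[user] = connected
        if connected != 1:                    # WHERE connected <> 1
            span = spans.setdefault((user, grp), [dt, dt])
            span[1] = dt
    result = []
    for (user, _), (start, end) in spans.items():
        # Elapsed whole minutes; this matches DATEDIFF(minute, ...) here because
        # every sample timestamp falls exactly on a minute boundary.
        minutes = int((end - start).total_seconds() // 60)
        if minutes >= min_minutes:            # HAVING ... >= 20
            result.append((user, minutes))
    return result


result = offline_islands(rows)
print(result)
```

On this sample, user 1's offline island runs from 10:05 to 10:20 (15 minutes) and is filtered out, while user 2's runs from 10:10 to 10:30 (20 minutes) and is kept. Note that the duration is measured between the first and last offline records, so it does not include the time up to the next connected record.
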

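Please show your expected results.

1. Is there a record for every user every 5 minutes? 2. Define "every day" in "users who were offline for at least 20 minutes every day".

Every day = 24 hours. There may not be a record every 5 minutes, but when the user eventually comes back online there will be a record with connected = 1.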