SQL Server 2008: How to sum all sessions where the gap between two consecutive messages is less than 10 minutes
I have a table that stores users' chat messages. Every message is logged in this table. I need to calculate the chat duration for a particular user.

A user may be chatting at time x, leave the chat at x+10, and start chatting again at x+20. The period between x+10 and x+20 should therefore not be counted.

The table structure and sample data are shown in the image. The different colors mark two chat sessions of the same user. As we can see, the difference between 663 and 662 is more than 1 hour, so that interval must be excluded from the result. The final result should be 2.33 minutes.
declare @messagetime1 as datetime
declare @messagetime2 as datetime
select @messagetime1=messagetime from tbl_chatMessages where ID=662
select @messagetime2=messagetime from tbl_chatMessages where ID=659
print datediff(second,@messagetime2,@messagetime1)
Result --- 97 seconds
declare @messagetime3 as datetime
declare @messagetime4 as datetime
select @messagetime3=messagetime from tbl_chatMessages where ID=668
select @messagetime4=messagetime from tbl_chatMessages where ID=663
print datediff(second,@messagetime4,@messagetime3)
Result -- 43 seconds
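Please suggest a solution for calculating the chat duration. This is only the logic I could come up with; if you have a better idea, please share it.

Try the following: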
WITH DATA
AS (SELECT t1.*,
CASE
WHEN
Isnull(Datediff(MI, t2.MESSAGETIME, t1.MESSAGETIME), 11) > 10
THEN 0
ELSE 1
END first_ident
FROM TABLE1 t1
LEFT JOIN TABLE1 t2
ON t1.ID = t2.ID + 1),
CTE
AS (SELECT ID,
MESSAGETIME,
ID gid,
0 AS tot_time
FROM DATA
WHERE FIRST_IDENT = 0
UNION ALL
SELECT t1.ID,
t1.MESSAGETIME,
t2.GID,
t2.TOT_TIME
+ Datediff(MI, t2.MESSAGETIME, t1.MESSAGETIME)
FROM DATA t1
INNER JOIN CTE t2
ON t1.ID = t2.ID + 1
AND t1.FIRST_IDENT = 1)
SELECT GID,
Max(TOT_TIME) Tot_time
FROM CTE
GROUP BY GID
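One caveat about units in the query above: DATEDIFF counts the datepart boundaries crossed between two values, not whole elapsed units, so totals built from minute differences can drift away from the 2.33 minutes the question expects. The sketch below uses made-up timestamps (they are not from the question's data) to show the effect.

```sql
-- Hypothetical timestamps, chosen only to illustrate boundary counting.
DECLARE @a DATETIME, @b DATETIME;
SET @a = '2013-05-20T10:00:59';
SET @b = '2013-05-20T10:01:00';

SELECT DATEDIFF(MINUTE, @a, @b) AS minute_boundaries, -- 1: one minute boundary was crossed
       DATEDIFF(SECOND, @a, @b) AS elapsed_seconds;   -- 1: only one second actually elapsed
```

Measuring in seconds and dividing by 60 at the end keeps the fractional minutes.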
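I have set up a working example of this. Take a look and let me know if you have any questions.

First you need to calculate the interval between adjacent messages. If the interval exceeds 600 seconds, the time between those two messages counts as 0.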
SELECT SUM(o.duration) / 60.00 AS duration
FROM dbo.tbl_chatMessages t1
OUTER APPLY (
SELECT TOP 1
CASE WHEN DATEDIFF(second, t2.messageTime, t1.messageTime) > 600
THEN 0
ELSE DATEDIFF(second, t2.messageTime, t1.messageTime) END
FROM dbo.tbl_chatMessages t2
WHERE t1.messageTime > t2.messageTime
ORDER BY t2.messageTime DESC
) o(duration)
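To see what the gap rule does row by row, here is a minimal self-contained sketch. The table variable @chat, the variable @idleSeconds and all timestamps are assumptions made for illustration; only the IDs, the 97-second and 43-second spans, and the gap of more than an hour come from the question.

```sql
-- Hypothetical sample data (timestamps invented to match the spans in the question).
DECLARE @chat TABLE (ID INT PRIMARY KEY, messageTime DATETIME NOT NULL);

INSERT INTO @chat (ID, messageTime) VALUES
 (659, '2013-05-20T10:00:00'), (660, '2013-05-20T10:00:30'),
 (661, '2013-05-20T10:01:05'), (662, '2013-05-20T10:01:37'),
 (663, '2013-05-20T11:15:00'), (664, '2013-05-20T11:15:08'),
 (665, '2013-05-20T11:15:15'), (666, '2013-05-20T11:15:22'),
 (667, '2013-05-20T11:15:31'), (668, '2013-05-20T11:15:43');

-- Assumed idle threshold: 600 seconds (10 minutes).
DECLARE @idleSeconds INT;
SET @idleSeconds = 600;

SELECT t1.ID,
       t1.messageTime,
       o.gapSeconds,  -- NULL for the very first message, which has no predecessor
       CASE WHEN o.gapSeconds > @idleSeconds THEN 0
            ELSE o.gapSeconds END AS countedSeconds
FROM @chat t1
OUTER APPLY (
    -- Gap to the closest earlier message.
    SELECT TOP 1 DATEDIFF(SECOND, t2.messageTime, t1.messageTime)
    FROM @chat t2
    WHERE t2.messageTime < t1.messageTime
    ORDER BY t2.messageTime DESC
) o(gapSeconds)
ORDER BY t1.messageTime;
```

With this data, row 663 has a gap of 4403 seconds and contributes 0, while the other gaps add up to 97 + 43 = 140 seconds, which is about 2.33 minutes.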
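See the demo.

Here is the reasoning behind my solution. First, identify every message that starts a chat period. You can do this with a flag that marks messages arriving more than 10 minutes after the previous one.

Then take a cumulative sum of that flag. The sum effectively serves as a grouping identifier for the chat period. Finally, aggregate the results to get the information for each chat period.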
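with cmflag as (
    select cm.*,
           -- 1 marks the first message of a chat period: no earlier message,
           -- or more than 10 minutes since the previous one
           (case when prevmessagetime is null or
                      datediff(minute, prevmessagetime, messagetime) > 10
                 then 1
                 else 0
            end) as ChatPeriodStartFlag
    from (select cm.*,
                 -- time of the closest earlier message involving this sender
                 (select top 1 cm2.messagetime
                  from tbl_chatMessages cm2
                  where (cm2.senderId = cm.senderId or
                         cm2.RecipientId = cm.senderId) and
                        cm2.messagetime < cm.messagetime
                  order by cm2.messagetime desc
                 ) as prevmessagetime
          from tbl_chatMessages cm
         ) cm
),
cmcum as (
    select cm.*,
           -- running total of start flags = chat period number
           (select sum(cmf.ChatPeriodStartFlag)
            from cmflag cmf
            where (cmf.senderId = cm.senderId or
                   cmf.RecipientId = cm.senderId) and
                  cmf.messagetime <= cm.messagetime
           ) as ChatPeriodGroup
    from cmflag cm
)
select cm.SenderId, cm.ChatPeriodGroup, min(cm.messageTime) as minT, max(cm.messageTime) as maxT
from cmcum cm
group by cm.SenderId, cm.ChatPeriodGroup;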
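The flag-and-running-total idea can be tried in isolation with a minimal sketch like the one below. It ignores sender and recipient and works in seconds rather than minutes. The table variable @chat and every timestamp are assumptions for illustration; only the IDs and the 97-second and 43-second spans come from the question.

```sql
-- Hypothetical sample data (timestamps invented to match the spans in the question).
DECLARE @chat TABLE (ID INT PRIMARY KEY, messageTime DATETIME NOT NULL);

INSERT INTO @chat (ID, messageTime) VALUES
 (659, '2013-05-20T10:00:00'), (660, '2013-05-20T10:00:30'),
 (661, '2013-05-20T10:01:05'), (662, '2013-05-20T10:01:37'),
 (663, '2013-05-20T11:15:00'), (664, '2013-05-20T11:15:08'),
 (665, '2013-05-20T11:15:15'), (666, '2013-05-20T11:15:22'),
 (667, '2013-05-20T11:15:31'), (668, '2013-05-20T11:15:43');

WITH flagged AS (
    -- startFlag = 1 when a message opens a new session
    SELECT c.ID, c.messageTime,
           CASE WHEN p.prevTime IS NULL
                  OR DATEDIFF(SECOND, p.prevTime, c.messageTime) > 600
                THEN 1 ELSE 0 END AS startFlag
    FROM @chat c
    OUTER APPLY (SELECT MAX(c2.messageTime) AS prevTime
                 FROM @chat c2
                 WHERE c2.messageTime < c.messageTime) p
),
grouped AS (
    -- running total of the flags gives each message its session number
    SELECT f.ID, f.messageTime,
           (SELECT SUM(f2.startFlag)
            FROM flagged f2
            WHERE f2.messageTime <= f.messageTime) AS sessionNo
    FROM flagged f
)
SELECT sessionNo,
       MIN(messageTime) AS firstMessage,
       MAX(messageTime) AS lastMessage,
       DATEDIFF(SECOND, MIN(messageTime), MAX(messageTime)) AS sessionSeconds
FROM grouped
GROUP BY sessionNo
ORDER BY sessionNo;
```

With this data the query returns two sessions of 97 and 43 seconds. The correlated running total is used because SQL Server 2008 has neither LAG nor an ordered SUM() OVER.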
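One challenge I may not fully understand is how to match senders and recipients. All rows in the sample data have the same pair. This query looks at the user from the SenderId perspective, but allows for the user being either the sender or the recipient during a chat.

Can you have the chat server generate a new chat ID whenever the delay between messages exceeds X minutes? If so, calculating the chat duration becomes very simple.

The title mentions roughly 1 minute, but what criterion indicates that this data contains two separate chat sessions rather than one longer session? In other words, how do you know that the user was chatting between rows 660 and 661, which are more than a minute apart, but not between rows 662 and 663, which are also more than a minute apart?

How do you define two different chats? From the data you posted, I can't see any way to tell the two apart.

@SteveKass: Thanks for pointing that out. I have to keep a configurable idle time equal to the session logout time, ~10-15 minutes.

@Gidil: There is no way to tell two chats apart... that is why I went the interval route. These entries come from a DLL I have no control over, so the ChatID column appears to be reserved but unused.

@BogdanSahlean: The same senderID and recipientID are the common element of a session.

You can use the following query. Of course, if possible, I would look at a slight modification of the table structure and an update to the chat server application code.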
DECLARE @Results TABLE(
RowNum INT NOT NULL,
senderID INT NOT NULL DEFAULT(80),
recipientID INT NOT NULL DEFAULT(79),
PRIMARY KEY(RowNum,senderID,recipientID),
messageTime DATETIME NOT NULL
);
INSERT INTO @Results(RowNum,senderID,recipientID,messageTime)
SELECT ROW_NUMBER() OVER(PARTITION BY senderID,recipientID ORDER BY messageTime, ID) AS RowNum,
c.senderID,c.recipientID,c.messageTime
FROM dbo.tbl_chatMessages c;
WITH RecursiveCTE
AS(
SELECT crt.RowNum,crt.senderID,crt.recipientID,
crt.messageTime,
1 AS SessionID
FROM @Results crt
WHERE crt.RowNum=1
UNION ALL
SELECT crt.RowNum,crt.senderID,crt.recipientID,
crt.messageTime,
CASE
WHEN DATEDIFF(MINUTE,prev.messageTime,crt.messageTime) <= 10 THEN prev.SessionID
ELSE prev.SessionID+1
END
FROM @Results crt INNER JOIN RecursiveCTE prev ON crt.RowNum=prev.RowNum+1
AND crt.senderID=prev.senderID
AND crt.recipientID=prev.recipientID
)
SELECT *,
STUFF(CONVERT(VARCHAR(8), DATEADD(SECOND,x.SessionDuration,0), 114), 1,3,'') AS SessionDuration_mmss,
SUM(x.SessionDuration) OVER() AS SessionDuration_Overall,
STUFF(CONVERT(VARCHAR(8), DATEADD(SECOND,SUM(x.SessionDuration) OVER(),0), 114), 1,3,'') AS SessionDuration_Overall_mmss
FROM(
SELECT r.senderID,r.recipientID,r.SessionID,
DATEDIFF(SECOND, MIN(r.messageTime),MAX(r.messageTime)) AS SessionDuration
FROM RecursiveCTE r
GROUP BY r.senderID,r.recipientID,r.SessionID
) x
OPTION(MAXRECURSION 0);
senderID recipientID SessionID SessionDuration SessionDuration_mmss SessionDuration_Overall SessionDuration_Overall_mmss
-------- ----------- --------- --------------- -------------------- ----------------------- ----------------------------
80       79          1         97              01:37                140                     02:20
80       79          2         43              00:43                140                     02:20
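The query above partitions by (senderID, recipientID), so a reply from user 79 to user 80 would land in a different partition than the messages from 80 to 79. If replies should count toward the same session, which is an assumption the question does not confirm, one option is to partition by a direction-independent pair key. The sketch below uses a hypothetical table variable and made-up rows; userLow and userHigh are illustrative names.

```sql
-- Hypothetical rows: users 80 and 79 write to each other.
DECLARE @chat TABLE (
    ID INT PRIMARY KEY,
    senderID INT NOT NULL,
    recipientID INT NOT NULL,
    messageTime DATETIME NOT NULL
);

INSERT INTO @chat (ID, senderID, recipientID, messageTime) VALUES
 (1, 80, 79, '2013-05-20T10:00:00'),
 (2, 79, 80, '2013-05-20T10:00:20'),
 (3, 80, 79, '2013-05-20T10:00:45');

SELECT k.userLow, k.userHigh,
       -- number the messages per conversation, regardless of direction
       ROW_NUMBER() OVER (PARTITION BY k.userLow, k.userHigh
                          ORDER BY c.messageTime, c.ID) AS RowNum,
       c.messageTime
FROM @chat c
CROSS APPLY (
    -- order the two user IDs so (80,79) and (79,80) map to the same key
    SELECT CASE WHEN c.senderID < c.recipientID THEN c.senderID ELSE c.recipientID END AS userLow,
           CASE WHEN c.senderID < c.recipientID THEN c.recipientID ELSE c.senderID END AS userHigh
) k;
```

All three rows get the key (79, 80) and are numbered 1 to 3, so the same recursive step could then join on userLow and userHigh instead of senderID and recipientID.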