T-SQL: summarizing data islands
I have read a lot about data islands and about summarizing them with CTEs or many subqueries. Most people seem to rely on clever math on dates, which looks cool, but I don't think it will work for me.

We have many vehicle data loggers that send status updates on different schedules. I am looking for a faster, non-looping way to summarize certain statuses.

- NodeId: device identifier
- LogId: primary key of the log entry
- AssembledTime: when the record was assembled on the device
- ReceivedTime: when the record was received on the server
- Speed: speed at the time of the record
- StatusText: can contain multiple keywords

Data is normally processed at the end of a trip (ignition on to ignition off).
SELECT RowNumber = ROW_NUMBER() OVER(ORDER BY l.AssembledTime),
l.NodeId,
l.LogId,
l.AssembledTime,
lm.Speed,
lm.StatusText,
StatusSpeed = CASE WHEN lm.StatusText like '%speed%' THEN 1 ELSE 0 END,
StatusAccident = CASE WHEN lm.StatusText like '%accident%' THEN 1 ELSE 0 END, --impact?
StatusSeatbeltDriving = CASE WHEN (lm.StatusText like '%seatbelt%' or lm.StatusText like '%s/b%') and lm.Speed > 10 THEN 1 ELSE 0 END,
StatusSeatbeltIdle = CASE WHEN (lm.StatusText like '%seatbelt%' or lm.StatusText like '%s/b%') and lm.Speed = 0 THEN 1 ELSE 0 END,
Status4wd = CASE WHEN (lm.StatusText like '%4wd%' or lm.StatusText like '%4x4%') THEN 1 ELSE 0 END
FROM Ctrack6.dbo.Logs l
JOIN Ctrack6.dbo.LogMobiles lm on l.LogId = lm.LogId
WHERE l.NodeId = @NodeId
AND l.AssembledTime between @TripStart AND @TripEnd
This gives me a list of all logs for the device's trip, in order:
RowNumber NodeId LogId AssembledTime Speed StatusText StatusSpeed StatusAccident StatusSeatbeltDriving StatusSeatbeltIdle Status4wd IsProcessed
1 3099 308815155 2015-05-26 11:05:43.000 0 Start up 0 0 0 0 0 0
2 3099 308815156 2015-05-26 11:05:55.000 0 Driving 0 0 0 0 0 0
3 3099 308815157 2015-05-26 11:06:25.000 10 Driving 0 0 0 0 0 0
4 3099 308815158 2015-05-26 11:06:45.000 11 Driving 0 0 0 0 0 0
5 3099 308815344 2015-05-26 11:07:15.000 0 Driving 0 0 0 0 0 0
6 3099 308815345 2015-05-26 11:07:16.000 0 Seatbelt 0 0 0 1 0 0
7 3099 308815477 2015-05-26 11:07:19.000 0 Seatbelt 0 0 0 1 0 0
8 3099 308815479 2015-05-26 11:07:24.000 0 Seatbelt 0 0 0 1 0 0
9 3099 308815481 2015-05-26 11:07:29.000 0 Seatbelt 0 0 0 1 0 0
10 3099 308815482 2015-05-26 11:07:34.000 0 Seatbelt 0 0 0 1 0 0
11 3099 308815598 2015-05-26 11:07:39.000 0 Seatbelt 0 0 0 1 0 0
12 3099 308815599 2015-05-26 11:07:44.000 0 Seatbelt 0 0 0 1 0 0
13 3099 308815600 2015-05-26 11:07:49.000 0 Seatbelt 0 0 0 1 0 0
14 3099 308815601 2015-05-26 11:07:54.000 0 Seatbelt 0 0 0 1 0 0
15 3099 308815729 2015-05-26 11:08:00.000 0 Seatbelt 0 0 0 1 0 0
16 3099 308815730 2015-05-26 11:08:05.000 0 Seatbelt 0 0 0 1 0 0
17 3099 308815731 2015-05-26 11:08:10.000 0 Seatbelt 0 0 0 1 0 0
18 3099 308815732 2015-05-26 11:08:15.000 0 Seatbelt 0 0 0 1 0 0
19 3099 308816439 2015-05-26 11:08:45.000 0 Seatbelt 0 0 0 1 0 0
20 3099 308816440 2015-05-26 11:09:15.000 0 Seatbelt 0 0 0 1 0 0
21 3099 308816441 2015-05-26 11:09:45.000 0 Seatbelt 0 0 0 1 0 0
22 3099 308816442 2015-05-26 11:10:07.000 0 Ignition off 0 0 0 0 0 0
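The expected result would summarize rows 6-21 with:
- NodeId: the device's NodeId
- StartLogId: LogId of row 6
- EndLogId: LogId of row 21
- EventStartTime: AssembledTime of row 6
- EventEndTime: AssembledTime of row 21
- EventType: "Seatbelt"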
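SELECT 1 -- primes @@ROWCOUNT so the loop body runs at least once
WHILE @@ROWCOUNT <> 0
BEGIN
    UPDATE TGT
    SET ChangeColumn = 1
    FROM YourTable TGT
    INNER JOIN YourTable PriorRow
        ON PriorRow.RowNum = TGT.RowNum - 1 -- PriorRow is the row just before TGT
    WHERE PriorRow.State = TGT.State
      AND TGT.ChangeColumn = 0 -- skip rows already marked so the loop can terminate
END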
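If you run this, it keeps looping until every state change has been found and marked.

I found my solution on another page. More specifically, I added two table variables: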
declare @logs table
(
LogId int PRIMARY KEY,
RowNumberAll int,
RowNumberNode int,
NodeId int,
AssembledTime datetime,
Speed int,
StatusText varchar(200),
StatusSpeed bit,
StatusAccident bit,
StatusSeatbeltDriving bit,
StatusSeatbeltIdle bit,
Status4wd bit,
UNIQUE(Nodeid, RowNumberNode),
UNIQUE(RowNumberAll)
)
declare @results table
(
EventType varchar(50),
NodeId int,
StartSeqNo int,
EndSeqNo int,
LogCount int,
UNIQUE(NodeId, StartSeqNo, EventType)
)
I added an extra column, RowNumberNode, to the query:
RowNumberNode = ROW_NUMBER() OVER(PARTITION BY NodeId ORDER BY l.AssembledTime),
I modified the example slightly to work with my code. For each status I have a block like this:
INSERT INTO @results (EventType, NodeId, StartSeqNo, EndSeqNo, LogCount)
SELECT 'Speed',
NodeId,
StartSeqNo=MIN(RowNumberNode),
EndSeqNo=MAX(RowNumberNode),
LogCount=MAX(RowNumberNode) - MIN(RowNumberNode) + 1
FROM
(
SELECT NodeId,
RowNumberNode,
rn=RowNumberNode-ROW_NUMBER() OVER (PARTITION BY NodeId ORDER BY RowNumberNode)
FROM @logs
WHERE StatusSpeed=1
) a
GROUP BY NodeId, rn
--HAVING MAX(RowNumberNode) - MIN(RowNumberNode) > 0 -- uncomment to drop single-row islands
ORDER BY NodeId, StartSeqNo;
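To make the row-number difference trick concrete, and to show one way of getting from sequence numbers back to the LogId and AssembledTime values asked for in the question, here is a minimal self-contained sketch. It uses a reduced sample (rows 4-8 of the trip listing, keeping their original row numbers) in a hypothetical table variable `@sample` with only the columns needed; the CTE names `flagged` and `islands` are likewise illustrative and not part of the original code.

```sql
-- Hypothetical reduced sample: rows 4-8 of the trip listing, one node only.
DECLARE @sample TABLE
(
    RowNumberNode      int PRIMARY KEY,
    NodeId             int,
    LogId              int,
    AssembledTime      datetime,
    StatusSeatbeltIdle bit
);

INSERT INTO @sample (RowNumberNode, NodeId, LogId, AssembledTime, StatusSeatbeltIdle)
VALUES (4, 3099, 308815158, '2015-05-26T11:06:45', 0),
       (5, 3099, 308815344, '2015-05-26T11:07:15', 0),
       (6, 3099, 308815345, '2015-05-26T11:07:16', 1),
       (7, 3099, 308815477, '2015-05-26T11:07:19', 1),
       (8, 3099, 308815479, '2015-05-26T11:07:24', 1);

WITH flagged AS
(
    -- Within a run of consecutive flagged rows, RowNumberNode and ROW_NUMBER()
    -- both increase by 1, so their difference stays constant (here 6-1, 7-2, 8-3 = 5).
    SELECT NodeId,
           RowNumberNode,
           grp = RowNumberNode - ROW_NUMBER() OVER (PARTITION BY NodeId ORDER BY RowNumberNode)
    FROM @sample
    WHERE StatusSeatbeltIdle = 1
),
islands AS
(
    -- One row per island: first and last sequence number of the run.
    SELECT NodeId,
           StartSeqNo = MIN(RowNumberNode),
           EndSeqNo   = MAX(RowNumberNode)
    FROM flagged
    GROUP BY NodeId, grp
)
-- Join back twice to pick up the LogId and time of the first and last row.
SELECT i.NodeId,
       StartLogId     = s.LogId,
       EndLogId       = e.LogId,
       EventStartTime = s.AssembledTime,
       EventEndTime   = e.AssembledTime,
       EventType      = 'Seatbelt'
FROM islands i
JOIN @sample s ON s.NodeId = i.NodeId AND s.RowNumberNode = i.StartSeqNo
JOIN @sample e ON e.NodeId = i.NodeId AND e.RowNumberNode = i.EndSeqNo;
```

On this sample the query returns a single row covering rows 6-8 (StartLogId 308815345, EndLogId 308815479). Joining back on the sequence numbers, rather than taking MIN(LogId) and MAX(LogId), avoids assuming that LogId always increases with AssembledTime.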
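I tried running it with one day of data (371,444 rows). It hadn't returned after an hour, so I killed it.

Sounds like you've already found a solution. In any case, for 371,444 rows I suspect an endless loop occurred; the logic in the UPDATE needs fixing, because it should stop updating once there is nothing left to update.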
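For comparison, state changes can also be flagged without any loop. The sketch below swaps in a different technique, the `LAG` window function (available from SQL Server 2012), and is not the method used in either answer. `@YourTable` and its rows are hypothetical stand-ins for `YourTable`, and `IsChange` is an illustrative column name: it is 1 where the state differs from the previous row, which is not necessarily the same meaning as `ChangeColumn` in the loop.

```sql
-- Hypothetical stand-in for YourTable with made-up rows.
DECLARE @YourTable TABLE
(
    RowNum int PRIMARY KEY,
    State  varchar(20)
);

INSERT INTO @YourTable (RowNum, State)
VALUES (1, 'Driving'),
       (2, 'Driving'),
       (3, 'Seatbelt'),
       (4, 'Seatbelt'),
       (5, 'Ignition off');

-- LAG(State) reads the previous row's State in RowNum order.
-- For the first row LAG returns NULL, the comparison is not true,
-- and the row is treated as the start of a new state.
SELECT RowNum,
       State,
       IsChange = CASE WHEN LAG(State) OVER (ORDER BY RowNum) = State THEN 0 ELSE 1 END
FROM @YourTable
ORDER BY RowNum;
```

Because this is a single pass over the rows in order, there is no termination condition to get wrong.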