SQL: How can I identify records that have specific flags equal to 1 within one year of each other?
I have location data that includes a location ID and a set of three 0/1 flags indicating whether the location's latitude, longitude, or address has changed, along with the month-end date on which the change occurred. So I am looking at something like this:
+------------+-------------+--------------+---------------+---------------------+
| LOCATIONID | XCOORDHANGE | YCOORDCHANGE | ADDRESSCHANGE | REPORTPERIOD |
+------------+-------------+--------------+---------------+---------------------+
| 1 | 0 | 0 | 1 | 2010-01-31 00:00:00 |
+------------+-------------+--------------+---------------+---------------------+
| 2 | 1 | 1 | 1 | 2010-03-31 00:00:00 |
+------------+-------------+--------------+---------------+---------------------+
| 1 | 1 | 1 | 0 | 2010-08-31 00:00:00 |
+------------+-------------+--------------+---------------+---------------------+
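My task is to identify locations that have moved. A move is defined as an x or y coordinate change together with an address change. (Sometimes a location is repositioned, so its coordinates change but its address does not; sometimes the address changes with no subsequent coordinate change. I am not interested in those sites.)
Identification is easy enough when all three flags are set to 1. The problem is that address and coordinate changes do not always happen at the same time. For example, location 1 shows an address change on 2010-01-31, but its coordinates changed on 2010-08-31. I need to look at each record and determine whether the "move" criteria are met within one year of the first change. For location 1 in my example above, I would consider it a "move" if the X and/or Y coordinate changes within 1 year of the address change (that is, the criteria are met within 1 year). Another complication is that a location may move more than once in the 4 years I am investigating. I will be doing this for the period from 2010-01-31 through 2014-12-31.
My first attempt was to use ROW_NUMBER() OVER (PARTITION BY LOCATIONID ORDER BY REPORTPERIOD ASC) AS rn and a self-join on a.rn = b.rn + 1 to link one record to the next, but this ignores locations that have moved more than once.
The end goal is to add a column, MEETREQ, which will be a bit, with a 1 indicating that the location had both a coordinate change and an address change and that these changes occurred within 1 year of each other.
The output would look like this: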
+------------+-------------+--------------+---------------+---------------------+---------+
| LOCATIONID | XCOORDHANGE | YCOORDCHANGE | ADDRESSCHANGE | REPORTPERIOD | MEETREQ |
+------------+-------------+--------------+---------------+---------------------+---------+
| 1 | 0 | 0 | 1 | 2010-01-31 00:00:00 | 1 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 2 | 1 | 1 | 1 | 2010-03-31 00:00:00 | 1 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 1 | 1 | 1 | 0 | 2010-08-31 00:00:00 | 0 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 3 | 0 | 0 | 1 | 2011-02-28 00:00:00 | 0 |
+------------+-------------+--------------+---------------+---------------------+---------+
| 4 | 1 | 1 | 0 | 2011-03-31 00:00:00 | 0 |
+------------+-------------+--------------+---------------+---------------------+---------+
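This is SQL Server 2008 R2. Thank you for your time; I hope I have provided enough clarity. I can provide additional details if necessary.

You could do it like this. Note that it uses an "evil" cursor, though. Personally, I find that a cursor keeps things clear when you are implementing complex business logic.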
DECLARE @LOCATIONID INT
DECLARE @XCOORDHANGE INT
DECLARE @YCOORDCHANGE INT
DECLARE @ADDRESSCHANGE INT
DECLARE @REPORTPERIOD DATETIME

-- one result row per address-change record, so repeated moves stay distinguishable
CREATE TABLE #Temp1 ( LOCATIONID INT, REPORTPERIOD DATETIME, HASMOVED BIT );

-- find all records that have an address change
DECLARE db_cursor CURSOR FOR
    SELECT LOCATIONID, XCOORDHANGE, YCOORDCHANGE, ADDRESSCHANGE, REPORTPERIOD
    FROM [TABLENAME]
    WHERE ADDRESSCHANGE = 1

OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @LOCATIONID, @XCOORDHANGE, @YCOORDCHANGE, @ADDRESSCHANGE, @REPORTPERIOD

WHILE @@FETCH_STATUS = 0
BEGIN
    -- find any other occurrence of this location within one year (before or after)
    -- of the address change; it must have an x or y coord change
    IF EXISTS(SELECT 0 FROM [TABLENAME]
              WHERE LOCATIONID = @LOCATIONID
                AND (XCOORDHANGE = 1 OR YCOORDCHANGE = 1)
                AND REPORTPERIOD BETWEEN DATEADD(year, -1, @REPORTPERIOD)
                                     AND DATEADD(year, 1, @REPORTPERIOD)
             )
        INSERT INTO #Temp1 (LOCATIONID, REPORTPERIOD, HASMOVED) VALUES (@LOCATIONID, @REPORTPERIOD, 1)
    ELSE
        INSERT INTO #Temp1 (LOCATIONID, REPORTPERIOD, HASMOVED) VALUES (@LOCATIONID, @REPORTPERIOD, 0)

    FETCH NEXT FROM db_cursor INTO @LOCATIONID, @XCOORDHANGE, @YCOORDCHANGE, @ADDRESSCHANGE, @REPORTPERIOD
END

CLOSE db_cursor
DEALLOCATE db_cursor

SELECT LOCATIONID, REPORTPERIOD, HASMOVED FROM #Temp1
If you like, you can join the result back to the existing [TABLENAME] at the end, which gives you the existing table with a HasMoved column.
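For comparison, the check that the cursor performs row by row can also be written as a single set-based query. This is only a sketch: `@Locations` is a hypothetical table variable standing in for [TABLENAME], loaded with the three sample rows from the question, and the window of one year before or after the address change is an assumed reading of "within one year of each other".

```sql
-- Hypothetical stand-in for [TABLENAME], holding the question's sample rows.
DECLARE @Locations TABLE (
    LOCATIONID    INT,
    XCOORDHANGE   BIT,
    YCOORDCHANGE  BIT,
    ADDRESSCHANGE BIT,
    REPORTPERIOD  DATETIME
);

INSERT INTO @Locations (LOCATIONID, XCOORDHANGE, YCOORDCHANGE, ADDRESSCHANGE, REPORTPERIOD)
VALUES (1, 0, 0, 1, '20100131'),
       (2, 1, 1, 1, '20100331'),
       (1, 1, 1, 0, '20100831');

-- One output row per address-change record.
-- HASMOVED = 1 when the same location has an x or y coordinate change
-- within one year before or after that address change (assumed window).
SELECT a.LOCATIONID,
       a.REPORTPERIOD,
       CASE WHEN EXISTS (
                SELECT 1
                FROM @Locations AS c
                WHERE c.LOCATIONID = a.LOCATIONID
                  AND (c.XCOORDHANGE = 1 OR c.YCOORDCHANGE = 1)
                  AND c.REPORTPERIOD BETWEEN DATEADD(year, -1, a.REPORTPERIOD)
                                         AND DATEADD(year,  1, a.REPORTPERIOD)
            )
            THEN 1 ELSE 0
       END AS HASMOVED
FROM @Locations AS a
WHERE a.ADDRESSCHANGE = 1;
```

Because the result carries both LOCATIONID and REPORTPERIOD, it can be joined back to the base table on those two columns, so a location that moves more than once gets a separate flag for each address change.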
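This may not be the exact logic you specified, but it should give you a general idea of the approach I am suggesting. Since you only care that (x or y) and address are non-zero within a period, I used SUM on them in the inner query: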
SELECT LocationID
,SUM(Xcoord) AS x
,SUM(Ycoord) AS y
,SUM(Address) AS a
FROM myTable
WHERE Period BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY LocationID
Then, in the outer query, use CASE to reduce them to a single column:
SELECT LocationID
,(CASE WHEN (x > 0 OR y > 0) AND a > 0 THEN 1 ELSE 0 END) AS MeetsReq
FROM (
SELECT LocationID
,SUM(Xcoord) AS x
,SUM(Ycoord) AS y
,SUM(Address) AS a
FROM myTable
WHERE Period BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY LocationID
) AS isrc
Then select from the base table, LEFT JOINed to the subquery, changing NULL values of MeetsReq to 0:
/* This is the final query.
   The 2 queries above are included here;
   they were separated only for explanation purposes. */
SELECT main.*, COALESCE(src.MeetsReq, 0) AS MeetsReq
FROM myTable AS main
LEFT OUTER JOIN (
SELECT LocationID
,(CASE WHEN (x > 0 OR y > 0) AND a > 0 THEN 1 ELSE 0 END) AS MeetsReq
FROM (
SELECT LocationID
,SUM(Xcoord) AS x
,SUM(Ycoord) AS y
,SUM(Address) AS a
FROM myTable
WHERE Period BETWEEN '2010-01-01' AND '2010-12-31'
GROUP BY LocationID
) AS isrc
) AS src ON main.LocationID = src.LocationID
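Note, though, that if a location is flagged 1 in MeetsReq, every record for that location will carry the same value.

Do you require that the output still be multiple records per location and date?

Ideally, the output would still be all of the records, with the MEETREQ flag set to 1 on the earliest date at which a move is confirmed.

If you are able to use a cursor, or a stored procedure with a temp table populated by several queries, that may be simplest. I also find it easier to manage when complex business logic has to be implemented.

Yes, I had thought about a cursor, but I have been told it is bad SQL, so I was not sure whether I should implement one. I would appreciate it if you could expand on the idea a little. I am not opposed to that route, and both options are available to me.

Cursors can be overused and do not have the best performance, but they are not evil; they have a purpose. They can be very useful, especially if you want to perform row-by-row logic. The temp table approach may be faster. How many rows are in the table, and how often will this query run?

I did need to change the logic a little, but it seems to work, thanks!

In SQL Server 2012+, you can do this using LEAD():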
-- compare each row with the next row (by reportperiod) for the same location
select t.*,
       (case when lead(addresschange) over (partition by locationid order by reportperiod) <> addresschange and
                  (lead(xcoordchange) over (partition by locationid order by reportperiod) <> xcoordchange or
                   lead(ycoordchange) over (partition by locationid order by reportperiod) <> ycoordchange
                  )
             then 0
             else 1
        end) as meetreq
from t;
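-- the same comparison without lead(), which is not available before SQL Server 2012:
-- outer apply fetches the next row (by reportperiod) for the same location
select t.*,
       (case when tnext.addresschange <> t.addresschange and
                  (tnext.xcoordchange <> t.xcoordchange or
                   tnext.ycoordchange <> t.ycoordchange
                  )
             then 0
             else 1
        end) as meetreq
from t outer apply
     (select top 1 t2.*
      from t t2
      where t2.locationid = t.locationid and t2.reportperiod > t.reportperiod
      order by t2.reportperiod asc
     ) tnext;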
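Since the question targets SQL Server 2008 R2 and asks for a per-row flag, here is one possible sketch that needs neither LEAD() nor a cursor. It rests on an assumed reading of the requirement: a row is flagged when it carries one kind of change (address, or x/y coordinate) and the complementary change occurs for the same location on that row or within the following year. `@Locations` is a hypothetical table variable loaded with the five rows from the question's expected output.

```sql
-- Hypothetical stand-in for the real table, holding the question's five rows.
DECLARE @Locations TABLE (
    LOCATIONID    INT,
    XCOORDHANGE   BIT,
    YCOORDCHANGE  BIT,
    ADDRESSCHANGE BIT,
    REPORTPERIOD  DATETIME
);

INSERT INTO @Locations (LOCATIONID, XCOORDHANGE, YCOORDCHANGE, ADDRESSCHANGE, REPORTPERIOD)
VALUES (1, 0, 0, 1, '20100131'),
       (2, 1, 1, 1, '20100331'),
       (1, 1, 1, 0, '20100831'),
       (3, 0, 0, 1, '20110228'),
       (4, 1, 1, 0, '20110331');

SELECT t.*,
       CASE
           -- address change, followed (or accompanied) by a coordinate change within a year
           WHEN t.ADDRESSCHANGE = 1
                AND EXISTS (SELECT 1
                            FROM @Locations AS c
                            WHERE c.LOCATIONID = t.LOCATIONID
                              AND (c.XCOORDHANGE = 1 OR c.YCOORDCHANGE = 1)
                              AND c.REPORTPERIOD >= t.REPORTPERIOD
                              AND c.REPORTPERIOD <= DATEADD(year, 1, t.REPORTPERIOD))
           THEN 1
           -- coordinate change, followed (or accompanied) by an address change within a year
           WHEN (t.XCOORDHANGE = 1 OR t.YCOORDCHANGE = 1)
                AND EXISTS (SELECT 1
                            FROM @Locations AS c
                            WHERE c.LOCATIONID = t.LOCATIONID
                              AND c.ADDRESSCHANGE = 1
                              AND c.REPORTPERIOD >= t.REPORTPERIOD
                              AND c.REPORTPERIOD <= DATEADD(year, 1, t.REPORTPERIOD))
           THEN 1
           ELSE 0
       END AS MEETREQ
FROM @Locations AS t
ORDER BY t.REPORTPERIOD;
```

For these five rows the computed flag agrees with the MEETREQ column in the question's expected output. Each row is evaluated on its own, so a location that moves several times over the study period can be flagged more than once. Whether a row that opens one move and falls within a year of another should be counted twice is a business decision this sketch does not settle.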