How to select sequential duplicates in SQL Server
I want to select duplicate entries from a SQL Server table, but only if the ids are consecutive.
I have been trying to adapt an existing answer to my needs, but I cannot get it to work.
That answer is for Oracle, but I see that SQL Server also has the lead and lag functions.
Also, I think that answer puts a * next to the duplicates, whereas I only want to select the duplicates.
select
    id, companyName,
    case
        when companyName in (prev, next)
        then '*'
    end match,
    prev,
    next
from
    (select
        id,
        companyName,
        lag(companyName, 1) over (order by id) prev,
        lead(companyName, 1) over (order by id) next
    from
        companies)
order by
    id;
Example:
So, from this data set:
id companyName
-------------------
1 dogs ltd
2 cats ltd
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
7 dogs ltd
8 pigs ltd
I want to select:
id companyName
-------------------
3 pigs ltd
4 pigs ltd
5 cats ltd
6 cats ltd
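The requirement above can be pinned down independently of any SQL dialect. The following is a minimal Python sketch (not T-SQL, and not taken from the original thread) that treats a row as a sequential duplicate when the same companyName also exists at id - 1 or id + 1. The names `ROWS` and `sequential_duplicates` are made up for illustration.

```python
# Sample rows from the question: (id, companyName).
ROWS = [
    (1, "dogs ltd"), (2, "cats ltd"), (3, "pigs ltd"), (4, "pigs ltd"),
    (5, "cats ltd"), (6, "cats ltd"), (7, "dogs ltd"), (8, "pigs ltd"),
]


def sequential_duplicates(rows):
    """Return rows whose name also appears at id - 1 or id + 1, ordered by id."""
    seen = set(rows)  # allows O(1) lookup of (id, name) pairs
    return [
        (row_id, name)
        for row_id, name in sorted(rows)
        if (row_id - 1, name) in seen or (row_id + 1, name) in seen
    ]


if __name__ == "__main__":
    for row in sequential_duplicates(ROWS):
        print(row)
```

A reference like this can be handy for cross-checking what any of the SQL queries return on the same sample data.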
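Update
Every now and then I am blown away by the number and quality of the answers I get. This is one of those times. I don't have enough expertise to judge whether one answer is better than another, so I accepted SqlZim's, because it was the first working answer I saw. But it is great to see the different approaches, especially since an hour ago I was still wondering "is this even possible?"
You can use Row_Number() with a partition by clause to pick out the duplicates: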
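;with cte as (
    -- Note: if id is unique (as in the sample data), every partition holds
    -- a single row, so RowN is always 1 and this query returns no rows.
    SELECT id, companyName,
        RowN = Row_Number() over (partition by id order by companyName) from #yourTable
)
Select * from cte where RowN > 1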
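Could you provide your input and expected output so this query can be verified?
This is a gaps-and-islands style problem, but instead of using two row_number() calls, we use id and row_number() in the innermost subquery. That is followed by count(*) over() to get the row count for each grp, and finally only the rows with cnt > 1 are returned.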
select id, companyname
from (
    select
        id
      , companyName
      , grp
      , cnt = count(*) over (partition by companyname, grp)
    from (
        select *
          , grp = id - row_number() over (partition by companyname order by id)
        from
            companies
    ) islands
) d
where cnt > 1
order by id
rextester demo:
Returns:
+----+-------------+
| id | companyname |
+----+-------------+
| 3 | pigs ltd |
| 4 | pigs ltd |
| 5 | cats ltd |
| 6 | cats ltd |
+----+-------------+
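To see why `id - row_number()` identifies the islands, it can help to trace the arithmetic outside the database. The sketch below is a Python model of the query's logic, not the answer's own code; `islands` and `sequential_duplicates_islands` are hypothetical names. Within one companyName, consecutive ids grow in step with the per-name row number, so their difference stays constant for the whole run.

```python
from collections import defaultdict

# Sample rows from the question: (id, companyName).
ROWS = [
    (1, "dogs ltd"), (2, "cats ltd"), (3, "pigs ltd"), (4, "pigs ltd"),
    (5, "cats ltd"), (6, "cats ltd"), (7, "dogs ltd"), (8, "pigs ltd"),
]


def islands(rows):
    """Model grp = id - row_number() over (partition by companyname order by id)."""
    row_number = defaultdict(int)  # per-name counter, like row_number()
    groups = defaultdict(list)     # (name, grp) -> ids belonging to that island
    for row_id, name in sorted(rows):
        row_number[name] += 1
        grp = row_id - row_number[name]
        groups[(name, grp)].append(row_id)
    return groups


def sequential_duplicates_islands(rows):
    """Keep only islands with more than one row (cnt > 1), ordered by id."""
    return sorted(
        (row_id, name)
        for (name, _grp), ids in islands(rows).items()
        if len(ids) > 1
        for row_id in ids
    )


if __name__ == "__main__":
    for key, ids in sorted(islands(ROWS).items()):
        print(key, ids)
    print(sequential_duplicates_islands(ROWS))
```

On the sample data, ids 5 and 6 for 'cats ltd' both get grp 3 (5 - 2 and 6 - 3), while id 2 for the same name gets grp 1, so only the consecutive pair shares an island.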
In the WHERE clause, you just need to restrict the result to rows whose companyName matches the previous or the next one:
select id, companyName
from (
    select id, companyName,
        lag(companyName, 1) over (order by id) as prev,
        lead(companyName, 1) over (order by id) as next
    from companies
) q
where companyName in (prev, next)
order by id;
To make sure there really are no gaps between the ids, you can do this:
select id, companyName
from (
    select id, companyName,
        lag(concat(id+1,companyName), 1) over (order by id) as prev,
        lead(concat(id-1,companyName), 1) over (order by id) as next
    from companies
) q
where concat(id,companyName) in (prev, next)
order by id;
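The difference between the two lag/lead queries only shows up when ids have gaps. The following Python sketch models both variants under that assumption; it is an illustration rather than a translation of the T-SQL, and `ROWS_WITH_GAP`, `_same_name` and `adjacent_duplicates` are invented names. It compares id and name as separate values, which also sidesteps a corner case of string concatenation: concat(1, '2abc') and concat(12, 'abc') both yield '12abc'.

```python
# Hypothetical data with a gap: id 3 is missing between the two 'pigs ltd' rows.
ROWS_WITH_GAP = [
    (1, "dogs ltd"), (2, "pigs ltd"), (4, "pigs ltd"),
    (5, "cats ltd"), (6, "cats ltd"),
]


def _same_name(neighbour, row, strict):
    """True if neighbour has row's name and, when strict, an adjacent id."""
    if neighbour is None or neighbour[1] != row[1]:
        return False
    return (not strict) or abs(neighbour[0] - row[0]) == 1


def adjacent_duplicates(rows, require_consecutive_ids=True):
    """Model lag()/lead() over (order by id), optionally checking id adjacency."""
    ordered = sorted(rows)
    result = []
    for i, row in enumerate(ordered):
        prev = ordered[i - 1] if i > 0 else None                # lag(..., 1)
        nxt = ordered[i + 1] if i + 1 < len(ordered) else None  # lead(..., 1)
        if (_same_name(prev, row, require_consecutive_ids)
                or _same_name(nxt, row, require_consecutive_ids)):
            result.append(row)
    return result


if __name__ == "__main__":
    print(adjacent_duplicates(ROWS_WITH_GAP, require_consecutive_ids=False))
    print(adjacent_duplicates(ROWS_WITH_GAP, require_consecutive_ids=True))
```

With the id check turned off, the two 'pigs ltd' rows at ids 2 and 4 are reported because they are neighbours in id order; with it turned on, only the 'cats ltd' rows at ids 5 and 6 remain.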
You were very close to what you want:
select id, companyName
from (select c.*,
        lag(companyName, 1) over (order by id) prev,
        lead(companyName, 1) over (order by id) next
    from companies c
) a
where CompanyName in (prev, next)
order by id;
Another form, using LEAD() and LAG() (SQL Server 2012 and later).
Here is the test data, so you can try it yourself:
CREATE TABLE #T (id int not null PRIMARY KEY, companyName varchar(16) not null)
INSERT INTO #t Values
(1, 'dogs ltd'),
(2, 'cats ltd'),
(3, 'pigs ltd'),
(4, 'pigs ltd'),
(5, 'cats ltd'),
(6, 'cats ltd'),
(7, 'dogs ltd'),
(8, 'pigs ltd')
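What about gaps between ids, where there are no rows in between?
Good point. In this case, only directly consecutive IDs.
Thanks Kannan. Should this return rows for my example data set above? When I try it, I get no rows on my target table. I'll set up a simpler table to test it.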