How to query for consecutive values exceeding a specific count in SQL Server?
I have a table:
id dummy_data
1 a
2 a
3 a
4 b
5 b
6 c
7 b
8 c
9 c
10 c
I need to query all consecutive dummy_data values whose run length exceeds some threshold (say 2), with the following result:
+----+------------+
| id | dummy_data |
+----+------------+
| 1 | a |
| 2 | c |
+----+------------+
I wrote this query:
select
t1.dummy_data
from
data_table t1
join data_table t2 on t1.id = t2.id + 1
join data_table t3 on t1.id = t3.id + 2
where
t1.dummy_data + ' ' + t2.dummy_data + ' ' + t3.dummy_data =
t1.dummy_data + ' ' + t1.dummy_data + ' ' + t1.dummy_data
...which gives me:
+------------+
| dummy_data |
+------------+
| a |
| c |
+------------+
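My understanding is that this is a variation of what is called a gaps-and-islands problem.
However, I have two problems:
1. What I did obviously does not scale, and
2. I just can't figure out how to reset the id primary key so that the ids are renumbered in the query result.
How can I do this? It can be found here:
...but the schema creation is below just in case: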
create table data_table (
id int primary key,
dummy_data varchar(10)
);
insert into data_table (id, dummy_data) values
(1, 'a'),
(2, 'a'),
(3, 'a'),
(4, 'b'),
(5, 'b'),
(6, 'c'),
(7, 'b'),
(8, 'c'),
(9, 'c'),
(10, 'c');
Thanks, everyone!
This handles any number of consecutive matches:
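with data as (
    select *,
        -- position of the row within the whole table
        row_number() over (order by id) as rn,
        -- position of the row among rows sharing its dummy_data value
        row_number() over (partition by dummy_data order by id) as rn2
    from T  -- T: your table (data_table in the question)
)
select row_number() over (order by min(id)) as id, min(dummy_data) as dummy_data
from data
-- rn - rn2 is constant within a run; dummy_data is included because
-- different values can end up with the same difference
group by dummy_data, rn - rn2
having count(*) >= X;  -- X: minimum run length (3 for the example)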
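To see why the difference of the two row numbers identifies a run, it may help to look at the intermediate values. The sketch below assumes the data_table schema from the question; grp is a column alias introduced here only for illustration.

```sql
-- Show both row numbers and their difference for every row.
SELECT id,
       dummy_data,
       ROW_NUMBER() OVER (ORDER BY id) AS rn,
       ROW_NUMBER() OVER (PARTITION BY dummy_data ORDER BY id) AS rn2,
       ROW_NUMBER() OVER (ORDER BY id)
         - ROW_NUMBER() OVER (PARTITION BY dummy_data ORDER BY id) AS grp  -- hypothetical alias
FROM data_table
ORDER BY id;
-- For the sample data, grp for ids 1..10 is: 0, 0, 0, 3, 3, 5, 4, 6, 6, 6
```

Within a run both counters advance together, so grp stays constant; when another value interrupts, only rn advances and grp changes. Two different values can still land on the same grp (for the three rows a, b, a the last two both get 1), so pairing grp with dummy_data when grouping is the safer choice.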
Adapting your own solution, but using LAG instead of joins:
WITH SUB AS
(SELECT id,
dummy_data,
lag(dummy_data, 1) OVER (ORDER BY ID) as dd1,
lag(dummy_data, 2) OVER (ORDER BY ID) as dd2
FROM data_table)
SELECT ROW_NUMBER() OVER (ORDER BY id) AS id, dummy_data
FROM SUB
WHERE dummy_data = dd1 AND dd1 = dd2
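One thing to watch with this fixed-offset LAG query: a run longer than three rows satisfies the WHERE clause more than once, so the same value would be reported repeatedly. A possible variation, assuming dummy_data is never NULL, keeps only the row where a run first reaches three by also looking three rows back. The dd3 column is a name introduced here for illustration.

```sql
WITH SUB AS
 (SELECT id,
         dummy_data,
         LAG(dummy_data, 1) OVER (ORDER BY id) AS dd1,
         LAG(dummy_data, 2) OVER (ORDER BY id) AS dd2,
         LAG(dummy_data, 3) OVER (ORDER BY id) AS dd3  -- hypothetical extra column
  FROM data_table)
SELECT ROW_NUMBER() OVER (ORDER BY id) AS id, dummy_data
FROM SUB
WHERE dummy_data = dd1 AND dd1 = dd2
  -- keep only the third row of each run:
  -- the row three back is either missing or holds a different value
  AND (dd3 IS NULL OR dd3 <> dummy_data);
```

On the sample data this keeps ids 3 and 10, which are renumbered to (1, a) and (2, c).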
You can use LAG or LEAD. For three in a row:
select min(id), dummy_data
from (select t.*,
lag(id, 2) over (order by id) as prev_id,
lag(id, 2) over (partition by dummy_data order by id) as prev_id_dd
from t
) t
where prev_id = prev_id_dd
group by dummy_data;
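The logic is simple. Peek back n-1 rows. Then peek back n-1 rows among those with the same dummy_data value. If the two ids are the same, then all the intermediate rows have the same value.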
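The same idea can be sketched for an arbitrary run length by passing the offset as an expression. This assumes the data_table schema from the question; @n is a variable introduced here for illustration, and the ROW_NUMBER column is added only to produce the renumbered ids the question asks for.

```sql
DECLARE @n int = 3;  -- hypothetical parameter: minimum run length

SELECT ROW_NUMBER() OVER (ORDER BY MIN(id)) AS id, dummy_data
FROM (SELECT t.*,
             -- id of the row n-1 positions back in the whole table
             LAG(id, @n - 1) OVER (ORDER BY id) AS prev_id,
             -- id of the row n-1 positions back among rows with the same value
             LAG(id, @n - 1) OVER (PARTITION BY dummy_data ORDER BY id) AS prev_id_dd
      FROM data_table t
     ) t
WHERE prev_id = prev_id_dd
GROUP BY dummy_data;
```

With @n = 3 the sample data yields (1, a) and (2, c). Because the outer query groups by dummy_data alone, a value that forms several qualifying runs is still reported once.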
Thanks. I get the error: Column "data.dummy_data" is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. What does this mean?
It needs to be aggregated.
Thanks, I was confused by the t variable in the from syntax. Does it stand for the table? Can you tell me why it is needed?
@saktas001. It is the table name, which is not specified in the question. Note: this is more general than the solution you have currently accepted. Extending it to any value of n is simple.