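SQL: look at the results. If sportsmen take part frequently, say in about half of the events, we want to iterate over the events and stop once consecutive events are found, rather than read on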

Tags: sql, oracle, subquery, window-functions, gaps-and-islands

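Here is the query for the second approach. We use a recursive query (because that is how we apply iterative processes in SQL). We start with all sportsmen and the first date. Then we go to the second date and stop for everyone who took part on both dates. For the rest we look at the third date and again stop for those who took part on the second and third dates. And so on.

There should be an index on date and sportsman so that result rows can be looked up quickly. I would even provide two indexes, because I don't know which column is more selective. So let the DBMS decide: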

create index idx2 on result (hold_date, sportsman_id);
create index idx3 on result (sportsman_id, hold_date);
Here is the query:

with dates as 
(
  select
    hold_date,
    lead(hold_date) over (order by hold_date) as next_date,
    min(hold_date) over (order by hold_date) as min_date
  from (select distinct hold_date from result)
)
, cte (sportsman_id, sportsman_name, rank, year_of_birth, personal_record, country,
       hold_date, next_date, was_in, is_in) as
(
  select
    s.sportsman_id, s.sportsman_name, s.rank, s.year_of_birth,
    s.personal_record, s.country, d.hold_date, d.next_date, 'NO',
    case when r.hold_date is not null then 'YES' else 'NO' end
  from sportsman s
  cross join (select * from dates where hold_date = min_date) d
  left join result r on r.sportsman_id = s.sportsman_id
                     and r.hold_date = d.hold_date
  union all
  select
    s.sportsman_id, s.sportsman_name, s.rank, s.year_of_birth,
    s.personal_record, s.country, d.hold_date, d.next_date, s.is_in,
    case when r.hold_date is not null then 'YES' else 'NO' end
  from cte s
  join dates d on d.hold_date = s.next_date
  left join result r on r.sportsman_id = s.sportsman_id
                     and r.hold_date = d.hold_date
  where not (s.was_in = 'YES' and s.is_in = 'YES')
)
select sportsman_id, sportsman_name, rank, year_of_birth, personal_record, country
from cte
where was_in = 'YES' and is_in = 'YES';
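
To try the query, some test data is needed. The following is only a sketch: the column types, the sportsman names and countries, and sportsman 3 are assumptions made up for illustration. The participation of sportsmen 1, 2 and 5 and the monthly dates follow the result table quoted in the comments (the date of competition 3 is assumed to follow the same monthly pattern).

```sql
-- Hypothetical minimal schema; only the column names come from the query above.
create table sportsman (
  sportsman_id    number primary key,
  sportsman_name  varchar2(50),
  rank            number,
  year_of_birth   number,
  personal_record number,
  country         varchar2(50)
);

create table result (
  competition_id number,
  sportsman_id   number references sportsman (sportsman_id),
  hold_date      date
);

-- Names, ranks and countries are placeholders.
insert into sportsman values (1, 'Sportsman 1', 1, 1990, 10, 'Country A');
insert into sportsman values (2, 'Sportsman 2', 2, 1991, 11, 'Country B');
insert into sportsman values (3, 'Sportsman 3', 3, 1992, 12, 'Country C');
insert into sportsman values (5, 'Sportsman 5', 1, 1993, 13, 'Country D');

-- Sportsman 1: competitions 1 to 5.
insert into result values (1, 1, date '2020-01-01');
insert into result values (2, 1, date '2020-02-01');
insert into result values (3, 1, date '2020-03-01');
insert into result values (4, 1, date '2020-04-01');
insert into result values (5, 1, date '2020-05-01');

-- Sportsman 2: competitions 1, 2, 4 and 5.
insert into result values (1, 2, date '2020-01-01');
insert into result values (2, 2, date '2020-02-01');
insert into result values (4, 2, date '2020-04-01');
insert into result values (5, 2, date '2020-05-01');

-- Sportsman 3 (made up): competitions 1, 3 and 5, so never two in a row.
insert into result values (1, 3, date '2020-01-01');
insert into result values (3, 3, date '2020-03-01');
insert into result values (5, 3, date '2020-05-01');

-- Sportsman 5: competitions 1, 2 and 5.
insert into result values (1, 5, date '2020-01-01');
insert into result values (2, 5, date '2020-02-01');
insert into result values (5, 5, date '2020-05-01');

commit;
```

In this data, sportsmen 1, 2 and 5 each took part on the first two dates, while sportsman 3 never took part on two successive dates.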

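Sample data and the desired result would be a great help.

Added as a screenshot.

Your data model shows that a competition can span several days (hence the date in the result table, which would otherwise be in the competition table). Does this also mean that two competitions can overlap? Can I find one competition on September 5 and 6 and another on September 4 and 7? If so, how should that be handled? Please include sample data, commented code and the expected result in your question as text (for code, preferably DDL/DML statements that we can copy and paste).

@ThorstenKettner As stated: there is one hold date per competition id. No overlaps, so the hold dates are in fact unique.

Does this depend on the competitions being exactly 1 month apart?

You seem to have copied my sample data (without attribution) and based your answer on it. While it works for my data, the OP's data does not have this property.

Ah yes, I assumed you had used the same data the OP provided, perhaps not. It is not a hard change to make, hold on...

The last part looks smooth, it works, and you also seem to have written it with performance in mind. Thanks a lot. Will read the article later.

@AndrewSayer Though I don't yet understand this: count(*) over (partition by sportsman_id, island) comps_in_island. It seems I get it now. What is the weak part of my code?

count(*) over (partition by sportsman_id, island) gives the number of rows that match the sportsman_id and island. You can run the query at different subquery levels to see what is happening. Your code is not necessarily weak, but it needs to join your large table to itself; you can limit that by using the dense_rank analytic function to get the same optimization I did. Using lag to check whether the previous competition number for that sportsman id is the global previous competition number would work fine, but you may find it hard to extend it to get other information about the runs.

Shouldn't sportsman id 5 count for that data? It is present in competitions 1 and 2.

@AndrewSayer The OP is unclear on this; sportsman 5 took part in competitions 1, 2 and 5, so they both meet and do not meet the criteria. I wrongly assumed that if they had any competition for which they did not take part in the previous or next competition, they would be excluded from the result. However, if they should be included, it is simple to change MIN to MAX in the last line.

@void_eater: What problem did you have with the query?

over(partition by sportman_id, oder by competition_id): the ",", the "oder" and "sportman_id". And then ORA-00904: "COMPETITION_ID": invalid identifier.

@void_eater: My bad. We need to select that column in the subquery first so that it can be used in the outer query. Fixed.

Thanks for the feedback while I am learning. But it is a test table/relation, and I guess its creator did not care about it too much. To be honest, though, Andrew Sayer's answer with the "islands" concept seems easier to understand and is probably more efficient (not sure).

Yes, that may well be. As mentioned, my suggestion only applies if you expect sportsmen to take part in many events. If almost all sportsmen take part in almost all competitions and you only want to filter out the few who never took part in consecutive competitions, then it is better to look up just a few competitions per sportsman until you find the first consecutive pair, rather than go through all the data.

| SPORTSMAN_ID |
| -----------: |
|            1 |
|            2 |

| SPORTSMAN_ID |
| -----------: |
|            1 |
|            2 |
|            5 |

SPORTSMAN_ID | FIRST_COMPETITION_ID | FIRST_HOLD_DATE     | LAST_COMPETITION_ID | LAST_HOLD_DATE
-----------: | -------------------: | :------------------ | ------------------: | :------------------
           1 |                    1 | 2020-01-01 00:00:00 |                   5 | 2020-05-01 00:00:00
           2 |                    1 | 2020-01-01 00:00:00 |                   2 | 2020-02-01 00:00:00
           2 |                    4 | 2020-04-01 00:00:00 |                   5 | 2020-05-01 00:00:00
           5 |                    1 | 2020-01-01 00:00:00 |                   2 | 2020-02-01 00:00:00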

-- Islands by date: subtracting row_number() months from hold_date yields the same
-- value for runs of consecutive competitions. This relies on competitions being
-- held exactly one month apart.
select sportsman_id, min(hold_date), max(hold_date), comps_in_island
from (
 select  competition_id, sportsman_id, hold_date, island, count(*) over (partition by sportsman_id, island) comps_in_island
 from (
  select  competition_id, sportsman_id, hold_date, add_months(hold_date, -1 * row_number() over (partition by sportsman_id order by hold_date)) island
  from    result
 )
)
where comps_in_island > 1
group by sportsman_id, island, comps_in_island;

-- Islands by competition_id: competition_id minus row_number() is constant within
-- a run of consecutive ids. This relies on competition_id values having no gaps.
select sportsman_id, min(competition_id), max(competition_id), comps_in_island
from (
 select  competition_id, sportsman_id, hold_date, island, count(*) over (partition by sportsman_id, island) comps_in_island
 from (
  select  competition_id, sportsman_id, hold_date, competition_id - row_number() over (partition by sportsman_id order by competition_id) island
  from    result
 )
)
where comps_in_island > 1
group by sportsman_id, island, comps_in_island;

-- Same idea, but dense_rank() first numbers the competitions globally (comp_number),
-- so gaps in competition_id do not break the islands.
select sportsman_id, min(competition_id), max(competition_id), comps_in_island
from (
 select  competition_id, sportsman_id, hold_date, island, count(*) over (partition by sportsman_id, island) comps_in_island
 from (
  select  competition_id, sportsman_id, hold_date, comp_number - row_number() over (partition by sportsman_id order by comp_number) island
  from (
   select  competition_id, sportsman_id, hold_date, dense_rank() over (partition by null order by competition_id) comp_number
   from    result
  )
 )
)
where comps_in_island > 1
group by sportsman_id, island, comps_in_island;
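
The comments suggest running the query at different subquery levels to see what is happening. As a minimal sketch of that, the innermost level can be run on its own. The inline rows below are an assumption for illustration (a sportsman who took part in competitions 1, 2, 4 and 5); `sample_result` and `rn` are names introduced here, not part of the original queries.

```sql
-- Illustrative only: sample_result stands in for the result table.
with sample_result (competition_id, sportsman_id) as (
  select 1, 2 from dual union all
  select 2, 2 from dual union all
  select 4, 2 from dual union all
  select 5, 2 from dual
)
select competition_id,
       sportsman_id,
       -- position of the row within this sportsman's competitions
       row_number() over (partition by sportsman_id order by competition_id) as rn,
       -- constant within a run of consecutive competition ids
       competition_id
         - row_number() over (partition by sportsman_id order by competition_id) as island
from sample_result
order by competition_id;
-- rn is 1, 2, 3, 4, so island is 0 for competitions 1 and 2 and 1 for competitions 4 and 5.
```

Counting rows per (sportsman_id, island), as the outer level does with count(*) over (partition by sportsman_id, island), then gives 2 for each of the two runs.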

-- Sportsmen whose previous competition id is exactly one less than the current one.
select distinct sportsman_id
from (
    select sportsman_id, competition_id,
        lag(competition_id) over (partition by sportsman_id order by competition_id) lag_competition_id
    from result r
) r
where competition_id = lag_competition_id + 1;

-- Same check, returning the full sportsman rows.
select s.*
from sportsman s
where exists (
    select 1
    from (
        select sportsman_id, competition_id,
            lag(competition_id) over (partition by sportsman_id order by competition_id) lag_competition_id
        from result r
    ) r
    where r.competition_id = r.lag_competition_id + 1 and r.sportsman_id = s.sportsman_id
);
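
The lag check above compares raw competition_id values, so it assumes the ids have no gaps. The comments mention comparing against the global previous competition number instead. One way this might look, combining the dense_rank numbering from the islands query with lag, is sketched below; it is not taken from any of the answers, and the alias `prev_comp_number` is introduced here for illustration.

```sql
-- Sketch: number the competitions globally first, then compare each sportsman's
-- previous competition number with the current one.
select distinct sportsman_id
from (
  select sportsman_id,
         comp_number,
         lag(comp_number) over (partition by sportsman_id order by comp_number) as prev_comp_number
  from (
    -- dense_rank() gives every row of the same competition the same number,
    -- with no gaps between successive competitions
    select sportsman_id,
           dense_rank() over (order by competition_id) as comp_number
    from result
  )
)
where comp_number = prev_comp_number + 1;
```

As the comments point out, this form is harder to extend than the islands queries if more information about each run is needed.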
select distinct hold_date
from result
order by hold_date;
create index idx1 on result (hold_date);