Oracle 11.2 SQL - help to condense data in an ordered set

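I have a data set with a timestamp column and several identifier columns. When ordered by timestamp, I want to condense each block of adjacent rows that share the same identifiers into a single row. I need the minimum and maximum timestamp of each block.

Source data: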

TSTAMP  ID1  ID2
t1      A    B  <= start of new block
t2      A    B
t3      C    D  <= start of new block
t4      E    F  <= start of new block
t5      E    F
t6      E    F
t7      A    B  <= start of new block
t8      G    H  <= start of new block
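
I thought this was ripe for windowing analytic functions, but I can't partition without grouping together all equal combinations of IDn, rather than only the combinations in adjacent rows when ordered by timestamp.

One workaround is to first create a key column in an inline view that I can group by later, i.e. every row in a block has the same value and each block has a different value. I can do this by using the LAG analytic function to compare row values and then calling a PL/SQL function that returns the nextval/currval of a sequence (calling nextval/currval directly in SQL is restricted in this context).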

select min(ilv.tstamp), max(ilv.tstamp), id1, id2
from (
  select case when (id1 != lag(id1,1,'*') over (partition by (1) order by tstamp) 
                 or id2 != lag(id2,1,'*') over (partition by (1) order by tstamp))
           then
             pk_seq_utils.gav_get_nextval
           else
             pk_seq_utils.gav_get_currval
           end ident, t.*
  from tab1 t
  order by tstamp) ilv
group by ident, id1, id2
order by 1;
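
where the gav_get_xxx functions simply return currval/nextval from a sequence.

But I would like to use SQL only and avoid PL/SQL, since otherwise I could just as easily write the whole thing in PL/SQL and output the result rows via a pipelined function.

Any ideas?

Thanks.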

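You need to do this step by step:

1. Detect ID changes and mark each change with a flag = 1.
2. Generate a key for each group, i.e. adjacent records with the same IDs, by building a running sum over the ID-change flag.
3. Group by the generated group key and get the min/max timestamps.

The query: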

select 
  min(tstamp) as min_tstamp,
  max(tstamp) as max_tstamp,
  min(id1) as id1,
  min(id2) as id2
from
(
  select 
    grouped.*, 
    sum(newgroup) over (order by tstamp) as groupkey
  from
  (
    select 
      mytable.*, 
      case when id1 <> lag(id1) over (order by tstamp) 
             or id2 <> lag(id2) over (order by tstamp) 
      then 1 else 0 end as newgroup 
    from mytable
    order by tstamp
  ) grouped
)
group by groupkey
order by groupkey;
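
The three steps above can also be traced outside the database. The following is a minimal Python sketch, not part of the original answer, that models the change flag, the running sum and the final grouping on the question's sample rows. The variable names (`flags`, `group_keys`, `blocks`) are illustrative assumptions.

```python
from itertools import accumulate

# Sample rows from the question, already ordered by tstamp: (tstamp, id1, id2).
rows = [
    ("t1", "A", "B"), ("t2", "A", "B"), ("t3", "C", "D"), ("t4", "E", "F"),
    ("t5", "E", "F"), ("t6", "E", "F"), ("t7", "A", "B"), ("t8", "G", "H"),
]

# Step 1: flag = 1 where the identifiers differ from the previous row.
# The first row has no predecessor and gets 0, mirroring LAG returning NULL.
flags = [0] + [int(prev[1:] != cur[1:]) for prev, cur in zip(rows, rows[1:])]

# Step 2: the running sum of the flags yields one key per block.
group_keys = list(accumulate(flags))

# Step 3: group by the key, keeping min/max timestamp and the identifiers.
blocks = {}
for (tstamp, id1, id2), key in zip(rows, group_keys):
    first, last = blocks.get(key, (tstamp, tstamp, id1, id2))[:2]
    blocks[key] = (min(first, tstamp), max(last, tstamp), id1, id2)

result = [blocks[k] for k in sorted(blocks)]
for row in result:
    print(*row)
```

Because the key only ever increases along the timestamp order, sorting by it returns the blocks chronologically.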

Tabibitosan to the rescue!

with sample_data as (select 't1' tstamp, 'A' id1, 'B' id2 from dual union all
                     select 't2' tstamp, 'A' id1, 'B' id2 from dual union all
                     select 't3' tstamp, 'C' id1, 'D' id2 from dual union all
                     select 't4' tstamp, 'E' id1, 'F' id2 from dual union all
                     select 't5' tstamp, 'E' id1, 'F' id2 from dual union all
                     select 't6' tstamp, 'E' id1, 'F' id2 from dual union all
                     select 't7' tstamp, 'A' id1, 'B' id2 from dual union all
                     select 't8' tstamp, 'G' id1, 'H' id2 from dual)
select   min(tstamp) min_tstamp, max(tstamp) max_tstamp, id1, id2
from     (select tstamp,
                 id1,
                 id2,
                 row_number() over (order by tstamp) - row_number() over (partition by id1, id2 order by tstamp) grp
          from   sample_data)
group by id1,
         id2,
         grp
order by min(tstamp);

MIN_TSTAMP MAX_TSTAMP ID1 ID2
---------- ---------- --- ---
t1         t2         A   B  
t3         t3         C   D  
t4         t6         E   F  
t7         t7         A   B  
t8         t8         G   H  
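
To see why the difference of the two row numbers stays constant within a block, here is a small Python sketch of the same computation. It is an illustration under the assumption that the rows are already sorted by tstamp, and the names `seen`, `grp` and `blocks` are made up for this example.

```python
from collections import defaultdict

# Sample rows from the question, already ordered by tstamp: (tstamp, id1, id2).
rows = [
    ("t1", "A", "B"), ("t2", "A", "B"), ("t3", "C", "D"), ("t4", "E", "F"),
    ("t5", "E", "F"), ("t6", "E", "F"), ("t7", "A", "B"), ("t8", "G", "H"),
]

seen = defaultdict(int)  # rows seen so far per (id1, id2): the partitioned row_number
blocks = {}
for overall_rn, (tstamp, id1, id2) in enumerate(rows, start=1):
    seen[(id1, id2)] += 1
    # Within a block both counters advance by 1 per row, so the difference is constant.
    grp = overall_rn - seen[(id1, id2)]
    key = (id1, id2, grp)
    first, last = blocks.get(key, (tstamp, tstamp))
    blocks[key] = (min(first, tstamp), max(last, tstamp))

# Equivalent of GROUP BY id1, id2, grp ORDER BY min(tstamp).
result = sorted((lo, hi, id1, id2) for (id1, id2, _), (lo, hi) in blocks.items())
for row in result:
    print(*row)
```

As soon as a row with different identifiers appears in between, only the overall counter advances, so the next block of the same identifiers gets a larger difference.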

You should be able to do this with the row_number window function, like this:

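select 
    min(tstamp) mints, max(tstamp) maxts, id1, id2
from (
    select 
       t.*,  -- Oracle needs a qualified t.* when other columns are selected too
       row_number() over (order by tstamp) 
     - row_number() over (partition by id1, id2 order by tstamp) as rn
    from t
) subq  -- Oracle does not accept AS before a table alias
group by id1, id2, rn
order by min(tstamp)  -- rn is only a grouping key, not a chronological sort key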

I can't test this on an Oracle database right now, but it works with MSSQL and should work in Oracle too, since the window functions behave the same way.
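
One caveat with this technique: the row-number difference is a grouping key, and it is not guaranteed to grow with time, so sorting the final result by it can put blocks in the wrong order. Sorting by the block's minimum timestamp avoids this. The Python sketch below illustrates the effect on a hypothetical identifier sequence (the data and the helper `row_number_diff` are assumptions for illustration, not taken from the thread).

```python
from collections import defaultdict

def row_number_diff(ids):
    """Return overall row_number minus per-identifier row_number for each row."""
    seen = defaultdict(int)
    diffs = []
    for overall_rn, ident in enumerate(ids, start=1):
        seen[ident] += 1
        diffs.append(overall_rn - seen[ident])
    return diffs

# Hypothetical identifiers in timestamp order: a long A block, one B, then A again.
ids = ["A", "A", "A", "B", "A"]
diffs = row_number_diff(ids)
print(diffs)  # [0, 0, 0, 3, 1]
```

Here the B block gets the key 3 while the later A block gets 1, so ordering by the key would list the last block before the B block.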


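You can use a trick to identify the gaps and islands: compare each row's position by tstamp among all rows with its position by tstamp within its id1, id2 combination: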

select tstamp, id1, id2,
  row_number() over (partition by id1, id2 order by tstamp)
    - row_number() over (order by tstamp) as block_id
from tab1;

TS I I   BLOCK_ID
-- - - ----------
t1 A B          0
t2 A B          0
t3 C D         -2
t4 E F         -3
t5 E F         -3
t6 E F         -3
t7 A B         -4
t8 G H         -7

The actual value of block_id doesn't matter, only that it is unique for each block within a combination. You can then group using that:

select min(tstamp) as min_tstamp, max(tstamp) as max_tstamp, id1, id2
from (
  select tstamp, id1, id2,
    row_number() over (partition by id1, id2 order by tstamp)
      - row_number() over (order by tstamp) as block_id
  from tab1
)
group by id1, id2, block_id
order by min(tstamp);

MI MA I I
-- -- - -
t1 t2 A B
t3 t3 C D
t4 t6 E F
t7 t7 A B
t8 t8 G H
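
The grouping has to include id1 and id2 as well as block_id, because block_id is only unique per combination: two blocks with different identifiers can end up with the same value. A minimal Python sketch on hypothetical data (not from the question) shows such a collision, using the same sign convention as the query above.

```python
from collections import defaultdict

# Hypothetical data: one (id1, id2) pair per row, in tstamp order.
ids = [("A", "B"), ("C", "D"), ("A", "B"), ("A", "B"), ("A", "B")]

seen = defaultdict(int)  # partitioned row_number per (id1, id2)
block_ids = []
for overall_rn, pair in enumerate(ids, start=1):
    seen[pair] += 1
    block_ids.append(seen[pair] - overall_rn)

# Grouping by block_id alone merges the C/D row with the second A/B block.
by_block_id_only = set(block_ids)
by_ids_and_block_id = set(zip(ids, block_ids))
print(block_ids)  # [0, -1, -1, -1, -1]
print(len(by_block_id_only), len(by_ids_and_block_id))  # 2 3
```

There are three real blocks in this data, and only the grouping that includes the identifiers finds all three.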

Brilliant - I hadn't thought of using SUM for a running-total approach. Thanks, Thorsten.

Well, obviously not that brilliant ;-) The other answers show a more elegant way. I have even used that technique in the past, but it didn't occur to me.

Damn! You were fast. Tabibitosan is the appropriate method for this problem.

@LalitKumarB Tabibitosan rocks! *{:-D