SQL: get the FIRST_VALUE from the current partition since the stage changed


sql google-bigquery


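I want to use the FIRST_VALUE function to compute first_updated_at. What I need is the first updated_at value since the stage last changed, not the first value ever recorded for that stage.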

with values as (

    select 1 as deal, 2 as stage, '2020-11-10' as updated_at, '2020-11-10' as first_updated_at
    union all
    select 1 as deal, 2 as stage, '2020-11-11' as updated_at, '2020-11-10' as first_updated_at 
    union all
    select 1 as deal, 3 as stage, '2020-11-12' as updated_at, '2020-11-12' as first_updated_at
    union all 
    select 1 as deal, 4 as stage, '2020-11-13' as updated_at, '2020-11-13' as first_updated_at
    union all 
    select 1 as deal, 4 as stage, '2020-11-14' as updated_at, '2020-11-13' as first_updated_at
    union all 
    select 1 as deal, 2 as stage, '2020-11-15' as updated_at, '2020-11-15' as first_updated_at
    union all 
    select 1 as deal, 2 as stage, '2020-11-16' as updated_at, '2020-11-15' as first_updated_at

)
select * from values

I tried using the FIRST_VALUE function as follows:

FIRST_VALUE(updated_at) OVER (PARTITION BY deal, stage ORDER BY updated_at ASC)

Am I missing something, or is what I want not possible?

Thanks in advance.

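You can do this with window functions. First, look up the previous row's stage and check whether it differs from the current row's stage. Then use a cumulative max over the updated_at values of the rows where the stage changed: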

with values as (

    select 1 as deal, 2 as stage, '2020-11-10' as updated_at, '2020-11-10' as first_updated_at
    union all
    select 1 as deal, 2 as stage, '2020-11-11' as updated_at, '2020-11-10' as first_updated_at 
    union all
    select 1 as deal, 3 as stage, '2020-11-12' as updated_at, '2020-11-12' as first_updated_at
    union all 
    select 1 as deal, 4 as stage, '2020-11-13' as updated_at, '2020-11-13' as first_updated_at
    union all 
    select 1 as deal, 4 as stage, '2020-11-14' as updated_at, '2020-11-13' as first_updated_at
    union all 
    select 1 as deal, 2 as stage, '2020-11-15' as updated_at, '2020-11-15' as first_updated_at
    union all 
    select 1 as deal, 2 as stage, '2020-11-16' as updated_at, '2020-11-15' as first_updated_at

)
select v.*,
       max(case when stage <> prev_stage or prev_stage is null then updated_at end) over (partition by deal order by updated_at) as imputed_first_updated_at
from (select v.*,
             lag(stage) over (partition by deal order by updated_at) as prev_stage
      from values v
     ) v
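
Since the question asks specifically about FIRST_VALUE, here is a minimal sketch of a variant that keeps FIRST_VALUE but partitions by a run identifier instead of by stage. It also runs the original attempt (PARTITION BY deal, stage) to show why that one fails: stage 2 occurs in two separate runs, and both land in the same partition. Assumptions: the SQL is executed in SQLite through Python's sqlite3 module rather than in BigQuery, purely so the sketch is runnable locally, and the names deals, changed, run_id and first_updated_at_per_query are hypothetical.

```python
import sqlite3

# Sample rows from the question: (deal, stage, updated_at).
ROWS = [
    (1, 2, "2020-11-10"),
    (1, 2, "2020-11-11"),
    (1, 3, "2020-11-12"),
    (1, 4, "2020-11-13"),
    (1, 4, "2020-11-14"),
    (1, 2, "2020-11-15"),
    (1, 2, "2020-11-16"),
]

# The attempt from the question: every stage-2 row shares one partition,
# so the later stage-2 run inherits the date of the earlier one.
NAIVE_QUERY = """
SELECT FIRST_VALUE(updated_at) OVER (
           PARTITION BY deal, stage ORDER BY updated_at
       ) AS first_updated_at
FROM deals
ORDER BY deal, updated_at
"""

# Gaps-and-islands variant: flag rows where the stage changes, turn the
# flags into a running run_id, then take FIRST_VALUE within each run.
RUN_QUERY = """
WITH flagged AS (
    SELECT deal, stage, updated_at,
           -- LAG is NULL on the first row, so the comparison is NULL
           -- and the ELSE branch marks it as the start of a run.
           CASE WHEN stage = LAG(stage) OVER (
                    PARTITION BY deal ORDER BY updated_at
                ) THEN 0 ELSE 1 END AS changed
    FROM deals
),
runs AS (
    SELECT deal, stage, updated_at,
           SUM(changed) OVER (
               PARTITION BY deal ORDER BY updated_at
           ) AS run_id
    FROM flagged
)
SELECT FIRST_VALUE(updated_at) OVER (
           PARTITION BY deal, run_id ORDER BY updated_at
       ) AS first_updated_at
FROM runs
ORDER BY deal, updated_at
"""


def first_updated_at_per_query(query, rows=ROWS):
    """Load the rows into an in-memory table and return one value per row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE deals (deal INTEGER, stage INTEGER, updated_at TEXT)"
    )
    conn.executemany("INSERT INTO deals VALUES (?, ?, ?)", rows)
    result = [row[0] for row in conn.execute(query)]
    conn.close()
    return result


if __name__ == "__main__":
    print("naive :", first_updated_at_per_query(NAIVE_QUERY))
    print("by run:", first_updated_at_per_query(RUN_QUERY))
```

On the sample data, the naive query returns 2020-11-10 for the last two rows, while the run-based query returns 2020-11-15, matching the first_updated_at column in the question. The same window functions (LAG, SUM ... OVER, FIRST_VALUE) exist in BigQuery, but this sketch has only been reasoned through against SQLite.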

"I am trying to use the FIRST_VALUE function"

Consider the option below:

select * except(updated_at_on_change), 
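  -- carry forward the most recent change timestamp; using plain updated_at here
  -- would return the previous row's date, which is wrong for runs of 3+ rows
  ifnull(updated_at_on_change, first_value(updated_at_on_change ignore nulls) over win) as first_updated_at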
from (
  select *,
    if(stage != ifnull(lag(stage) over win, stage - 1), updated_at, null) updated_at_on_change
  from values
  window win as (partition by deal order by updated_at)
)
window win as (partition by deal order by updated_at desc rows between 1 following and unbounded following )
# order by updated_at

If applied to the sample data in the question, the output is:

I assume the last column is the desired result.