SQL - Using LEAD to skip rows that match a specific condition

Using standard SQL in Google BigQuery.

I have a table with two order types, A and B:

Id | Type | OrderDate
-----------------
1  |  A   | 2019-03-01
2  |  B   | 2019-03-04
3  |  B   | 2019-03-04
4  |  A   | 2019-03-05
5  |  A   | 2019-03-06
6  |  B   | 2019-04-05
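For each order of type A, I want to find out when the next order of type B occurs, ignoring all other A orders.

So, with my sample data, I would like to return the following: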

Id | Type | NextOrderBDate
--------------------------------
1  |  A   | 2019-03-04
4  |  A   | 2019-04-05
5  |  A   | 2019-04-05
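I do get the result, very inefficiently, by joining two separate tables for A and B to each other, but the data set is very large and the query takes over an hour to run.

What I'm trying to do now is use a LEAD statement, like this: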

SELECT Id, Type,
    LEAD(OrderDate) OVER (PARTITION BY Id ORDER BY OrderDate)
FROM xxx
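Obviously, the problem here is that this returns the next date regardless of order type.

I wonder whether the key is to work out the correct offset for each row, so that it leads to the next type B order. I'm struggling to find a (clean) solution.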


Thanks in advance.

You can use a correlated subquery in the SELECT list, as follows:

select
    id,
    type,
    (
        select min(OrderDate) 
        from mytable t1 
        where t1.Type = 'B' and t1.OrderDate >= t.OrderDate
    ) NextOrderBDate
from mytable t
where type = 'A'

id | type | NextOrderBDate
--------------------------
1  |  A   | 2019-03-04
4  |  A   | 2019-04-05
5  |  A   | 2019-04-05

Just use a cumulative min:

select t.*
from (select t.*,
             min(case when type = 'B' then orderdate end) over (order by orderdate) as next_b_orderdate
      from t
     ) t
where type = 'A';
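
As a quick sanity check of the query above, here is a minimal Python sketch that runs it on the question's sample data. It assumes an in-memory SQLite database (3.25 or later, for window functions) as a stand-in for BigQuery, with dates stored as ISO text; the names `SAMPLE_ROWS`, `DEFAULT_FRAME_SQL` and `run_default_frame` are made up for illustration. With no frame clause, an ordered window runs from the start of the partition up to the current row, so the value returned is the earliest B date seen so far, not the next one.

```python
import sqlite3

# Sample rows from the question: (id, type, orderdate as ISO text).
SAMPLE_ROWS = [
    (1, "A", "2019-03-01"),
    (2, "B", "2019-03-04"),
    (3, "B", "2019-03-04"),
    (4, "A", "2019-03-05"),
    (5, "A", "2019-03-06"),
    (6, "B", "2019-04-05"),
]

# The cumulative-min query, with no explicit window frame.
DEFAULT_FRAME_SQL = """
select id, next_b_orderdate
from (select t.*,
             min(case when type = 'B' then orderdate end)
                 over (order by orderdate) as next_b_orderdate
      from t) t
where type = 'A'
order by id
"""


def run_default_frame():
    """Load the sample rows into SQLite (assumed stand-in) and run the query."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (id integer, type text, orderdate text)")
    con.executemany("insert into t values (?, ?, ?)", SAMPLE_ROWS)
    result = con.execute(DEFAULT_FRAME_SQL).fetchall()
    con.close()
    return result


print(run_default_frame())
# [(1, None), (4, '2019-03-04'), (5, '2019-03-04')]
```

On this data the first A order gets no date at all, and the later A orders get a B date that lies in the past.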

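@Gordon Linoff is right, except for one small error: the next B order has to be found relative to each current row. So the window frame should be adjusted accordingly: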

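-- Sample data from the question. BigQuery rejects a bare UNION,
-- so UNION ALL is used and the columns are named in the first SELECT.
with t as (
  select 1 as id, 'A' as type, date '2019-03-01' as orderdate union all
  select 2  ,  'B'   , date '2019-03-04' union all
  select 3  ,  'B'   , date '2019-03-04' union all
  select 4  ,  'A'   , date '2019-03-05' union all
  select 5  ,  'A'   , date '2019-03-06' union all
  select 6  ,  'B'   , date '2019-04-05'
)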
select t.*
from (select t.*,
             min(case when type = 'B' then orderdate end)
             over (order by orderdate 
                 rows between current row and unbounded following
             ) as next_b_orderdate
      from t
     ) t
where type = 'A';
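
To confirm that the adjusted frame gives the output the question asks for, the sketch below runs the same window expression on the sample data from Python. It assumes an in-memory SQLite database (3.25 or later) standing in for BigQuery, with dates stored as ISO text so that `min` compares them correctly; `ORDERS`, `FORWARD_FRAME_SQL` and `next_b_dates` are hypothetical names used only here.

```python
import sqlite3

# Sample rows from the question: (id, type, orderdate as ISO text).
ORDERS = [
    (1, "A", "2019-03-01"),
    (2, "B", "2019-03-04"),
    (3, "B", "2019-03-04"),
    (4, "A", "2019-03-05"),
    (5, "A", "2019-03-06"),
    (6, "B", "2019-04-05"),
]

# Same query as above, but the frame looks forward from the current row.
FORWARD_FRAME_SQL = """
select id, type, next_b_orderdate
from (select t.*,
             min(case when type = 'B' then orderdate end)
                 over (order by orderdate
                       rows between current row and unbounded following
                 ) as next_b_orderdate
      from t) t
where type = 'A'
order by id
"""


def next_b_dates(rows):
    """Return (id, type, next B order date) for every type A order."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (id integer, type text, orderdate text)")
    con.executemany("insert into t values (?, ?, ?)", rows)
    result = con.execute(FORWARD_FRAME_SQL).fetchall()
    con.close()
    return result


for row in next_b_dates(ORDERS):
    print(row)
# (1, 'A', '2019-03-04')
# (4, 'A', '2019-04-05')
# (5, 'A', '2019-04-05')
```

The whole result comes from one ordered pass over the table, with no self-join.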
