Snowflake: window function 'RANGE' frame not supported, how to query?


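I have a transactions table that includes txn_date and customer_id.

For each customer with a transaction in December, I want to know how many transactions that customer had in the 90 days before the given transaction.

This looks like a query that could be run with a window function and a RANGE sliding window frame, but Snowflake does not support RANGE sliding window frames.


How can I run this query in Snowflake?

How about something like this: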

WITH T1 AS (
    SELECT CUSTOMER_ID, TX_DATE
    FROM TRANSACTIONS
    WHERE TX_DATE BETWEEN '2020-12-01' AND '2020-12-31')
SELECT T2.CUSTOMER_ID, T2.TX_DATE
FROM TRANSACTIONS T2
INNER JOIN T1 ON T2.CUSTOMER_ID = T1.CUSTOMER_ID
WHERE T2.TX_DATE BETWEEN (T1.TX_DATE - 90) AND T1.TX_DATE

As a starting point, this does much the same as NickW's answer:

WITH data AS (
    SELECT txn_date::timestamp_ntz as txn_date, cust_id, txn_id
    FROM VALUES
        ('2020-12-04',0, 0),
        ('2020-12-03',1, 1),
        ('2020-11-04',1, 2),
        ('2020-10-04',1, 3),
        ('2020-09-04',1, 4), -- just on 90 days
        ('2020-09-02',1, 5), -- too far
        ('2021-01-05',1, 6)  -- in the future
        v(txn_date , cust_id, txn_id)
), dec_txn AS (
    SELECT txn_id,
        cust_id,
        DATEADD('day',-90, txn_date) AS win_start,
        txn_date AS win_end
    FROM data 
    WHERE date_trunc('month', txn_date) = '2020-12-01'
)
SELECT dt.*
    ,t.*
    ,datediff('days', dt.win_end, t.txn_date) as win_time
FROM dec_txn AS dt
LEFT JOIN data AS t 
    ON t.cust_id = dt.cust_id 
    AND t.txn_date between dt.win_start and dt.win_end AND t.txn_id != dt.txn_id
    ;
which gives:

TXN_ID   CUST_ID    WIN_START                 WIN_END                   TXN_DATE                  CUST_ID   TXN_ID   WIN_TIME
1        1          2020-09-04 00:00:00.000   2020-12-03 00:00:00.000   2020-11-04 00:00:00.000   1         2        -29
1        1          2020-09-04 00:00:00.000   2020-12-03 00:00:00.000   2020-10-04 00:00:00.000   1         3        -60
1        1          2020-09-04 00:00:00.000   2020-12-03 00:00:00.000   2020-09-04 00:00:00.000   1         4        -90
0        0          2020-09-05 00:00:00.000   2020-12-04 00:00:00.000   NULL                      NULL      NULL     NULL
From that, we can aggregate to a count per December transaction:

WITH data AS (
    SELECT txn_date::timestamp_ntz as txn_date, cust_id, txn_id
    FROM VALUES
        ('2020-12-04',0, 0),   
        ('2020-12-03',1, 1),   
        ('2020-11-04',1, 2),
        ('2020-10-04',1, 3),
        ('2020-09-04',1, 4), -- just on 90 days
        ('2020-09-02',1, 5), -- too far
        ('2021-01-05',1, 6) -- in the future
        v(txn_date , cust_id, txn_id)
), dec_txn AS (
    SELECT txn_id,
        cust_id,
        txn_date,
        DATEADD('day',-90, txn_date) AS win_start,
        txn_date AS win_end
    FROM data 
    WHERE date_trunc('month', txn_date) = '2020-12-01'
)
SELECT dt.cust_id
    ,dt.txn_id
    ,dt.txn_date
    ,count(t.txn_id) as c__prior_90_days_transaction
FROM dec_txn AS dt
LEFT JOIN data AS t 
ON t.cust_id = dt.cust_id 
AND t.txn_date between dt.win_start and dt.win_end AND t.txn_id != dt.txn_id
GROUP BY 1,2,3
ORDER BY 1,2
;
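What the question does not clearly define is what should happen if a customer has multiple transactions in December, or multiple transactions on the same December day.

The query above returns one row per December transaction per customer, and the count includes other transactions that happened on the same day. If your date/timestamp values carry a time component, it will only count the same-day transactions that happened earlier that day. But if you want prior days only, and txn_date is just a date, then use: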

AND t.txn_date >= dt.win_start and t.txn_date < dt.win_end AND t.txn_id != dt.txn_id
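
One way to sidestep the same-day ambiguity, offered as a sketch rather than as part of the original answer, is to report one row per customer per December day and count only transactions from strictly earlier days. It assumes a table named transactions (a hypothetical name) with the txn_date, cust_id and txn_id columns used in the sample data above.

```sql
-- Sketch only: one row per customer per December day.
-- "transactions" is an assumed table name; the columns mirror the sample data.
WITH dec_days AS (
    -- Collapse multiple transactions on the same day into a single row
    SELECT DISTINCT
        cust_id,
        txn_date::date AS txn_day
    FROM transactions
    WHERE txn_date >= '2020-12-01' AND txn_date < '2021-01-01'
)
SELECT dd.cust_id
    ,dd.txn_day
    ,COUNT(t.txn_id) AS prior_90_day_txn_count  -- 0 when nothing matches
FROM dec_days AS dd
LEFT JOIN transactions AS t
    ON t.cust_id = dd.cust_id
    AND t.txn_date >= DATEADD('day', -90, dd.txn_day)  -- window start, inclusive
    AND t.txn_date < dd.txn_day                        -- strictly before that day
GROUP BY 1,2
ORDER BY 1,2
;
```

If transactions held the sample rows from the data CTE above, this would return a count of 3 for customer 1 on 2020-12-03 and 0 for customer 0 on 2020-12-04. Because the join is driven by distinct days rather than by txn_id, no self-exclusion predicate is needed.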
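Once the window timestamps are truncated to days, you have to work out whether you want midnight transactions to count toward that day, or whether you have no midnight timestamps at all. The join condition stays the same:

AND t.txn_date >= dt.win_start and t.txn_date < dt.win_end AND t.txn_id != dt.txn_id

and dec_txn truncates both window bounds to dates:
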
dec_txn AS (
    SELECT txn_id,
        cust_id,
        DATEADD('day',-90, txn_date::date) AS win_start,
        txn_date::date AS win_end
    FROM data 
    WHERE date_trunc('month', txn_date) = '2020-12-01'