How to create a column that counts only the changes in another column in Redshift?

sql amazon-redshift

I have the following dataset:

product   customer    date                        value     buyer_position
A         123455      2020-01-01 00:01:01         100       1
A         123456      2020-01-02 00:02:01         100       2
A         523455      2020-01-02 00:02:05         100       NULL
A         323455      2020-01-03 00:02:07         100       NULL
A         423455      2020-01-03 00:09:01         100       3
B         100455      2020-01-01 00:03:01         100       1
B         999445      2020-01-01 00:04:01         100       NULL
B         122225      2020-01-01 00:04:05         100       2
B         993848      2020-01-01 10:04:05         100       3
B         133225      2020-01-01 11:04:05         100       NULL
B         144225      2020-01-01 12:04:05         100       4

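The dataset contains the products a company sells and the customers who saw each product. A customer can see multiple products, but no product + customer combination is repeated. For each row, I want to know how many people had bought the product before that customer saw it.

This would be the perfect output: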

product   customer    date                        value     buyer_position     people_before
A         123455      2020-01-01 00:01:01         100       1                  0
A         123456      2020-01-02 00:02:01         100       2                  1
A         523455      2020-01-02 00:02:05         100       NULL               2
A         323455      2020-01-03 00:02:07         100       NULL               2
A         423455      2020-01-03 00:09:01         100       3                  2
B         100455      2020-01-01 00:03:01         100       1                  0
B         999445      2020-01-01 00:04:01         100       NULL               1
B         122225      2020-01-01 00:04:05         100       2                  1
B         993848      2020-01-01 10:04:05         100       3                  2
B         133225      2020-01-01 11:04:05         100       NULL               3
B         144225      2020-01-01 12:04:05         100       4                  3

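As you can see, by the time a customer saw the product he wanted, two people had already bought it. For example, in the case of customer 323455, two people had already bought product A.

I think I should use some window function, such as lag(). But lag() does not give me this kind of "cumulative" information, so I am a bit lost.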

This looks like a window count of the non-null values of buyer_position over the preceding rows:

select t.*,
    coalesce(count(buyer_position) over(
        partition by product
        order by date
        rows between unbounded preceding and 1 preceding
    ), 0) as people_before
from mytable t
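
To try the window count without creating a table, here is a minimal sketch that feeds product A's rows from the question through the same frame. The CTE name sample_data is hypothetical, and the explicit cast on the first NULL is only an assumption to keep the column typed as an integer.

```sql
-- Hypothetical inline sample: product A's rows from the question.
with sample_data as (
    select 'A' as product, 123455 as customer,
           timestamp '2020-01-01 00:01:01' as date,
           100 as value, 1 as buyer_position
    union all select 'A', 123456, timestamp '2020-01-02 00:02:01', 100, 2
    union all select 'A', 523455, timestamp '2020-01-02 00:02:05', 100, cast(null as int)
    union all select 'A', 323455, timestamp '2020-01-03 00:02:07', 100, cast(null as int)
    union all select 'A', 423455, timestamp '2020-01-03 00:09:01', 100, 3
)
select s.*,
    -- count(column) skips NULLs, so only rows with a buyer_position are counted.
    -- The frame ends at "1 preceding", so the current row is never included.
    coalesce(count(buyer_position) over(
        partition by product
        order by date
        rows between unbounded preceding and 1 preceding
    ), 0) as people_before
from sample_data s
order by product, date;
-- people_before, top to bottom: 0, 1, 2, 2, 2
```

The frame clause is what makes the count cumulative: lag() reads a single earlier row, whereas an aggregate over "unbounded preceding and 1 preceding" sees every earlier row in the partition.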

Hmmm. If I understand correctly, you want the maximum buyer position for the customer/product minus 1:

select t.*,
       max(buyer_position) over (partition by customer, product order by date rows between unbounded preceding and current row) - 1
from t;

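It doesn't work. When the buyer_position column is NULL, the people_before column is also NULL. That is not what I want.

@dummyds. Your sample data has no example where the first buyer position is NULL, so it is unclear what you want in that case.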
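
A likely reason for the NULLs in the max() query: each product + customer pair appears only once, so partitioning by customer and product leaves one row per partition, and the max is just that row's own buyer_position. Below is a sketch of one way to adapt the running-max idea. It is not the answerer's query: it partitions by product alone and looks only at strictly preceding rows, so no "- 1" is needed. It assumes buyer_position increases with date within each product, which holds in the sample. The CTE name sample_data is hypothetical and holds product B's rows from the question.

```sql
-- Hypothetical inline sample: product B's rows from the question.
with sample_data as (
    select 'B' as product, 100455 as customer,
           timestamp '2020-01-01 00:03:01' as date,
           100 as value, 1 as buyer_position
    union all select 'B', 999445, timestamp '2020-01-01 00:04:01', 100, cast(null as int)
    union all select 'B', 122225, timestamp '2020-01-01 00:04:05', 100, 2
    union all select 'B', 993848, timestamp '2020-01-01 10:04:05', 100, 3
    union all select 'B', 133225, timestamp '2020-01-01 11:04:05', 100, cast(null as int)
    union all select 'B', 144225, timestamp '2020-01-01 12:04:05', 100, 4
)
select s.*,
    -- max() ignores NULLs, so the highest buyer_position seen so far carries
    -- forward across non-buyer rows. coalesce covers the first row of each
    -- product and any leading rows whose buyer_position is NULL.
    coalesce(max(buyer_position) over(
        partition by product
        order by date
        rows between unbounded preceding and 1 preceding
    ), 0) as people_before
from sample_data s
order by product, date;
-- people_before, top to bottom: 0, 1, 1, 2, 3, 3
```

With this frame, a product whose first rows have a NULL buyer_position gets 0 for those rows. Whether that is the desired result is the open question raised in the comment above.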