SQL: full outer self-join with data from different dates


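I want to compare each day's values with those of the following day (and also see which colours are new or no longer appear).

I have done a full outer self-join and replaced the null values on the right-hand side of `is_matched`. `is_matched` shows whether the join found a match, or whether the right-hand side would be null without `coalesce`.

The only problem is the last column, `this_is_not_working`. It should hold the `total_colour` value for `date_local_2` rather than the one for `date_local`, and I don't know how to replace all the nulls in the `this_is_not_working` column with those values. I tried window functions and intervals, but nothing really worked.

I created this example in Postgres, but I am actually using Presto.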

select * from colours

| date_local | colour | colours | amount | total_colour | is_matched | date_local_2 | colour_2 | colour_2 | amount_2 | colour_difference | amount_difference | this_is_not_working |
| :--------- | :----- | ------: | -----: | -----------: | :--------- | :------------------ | :------- | -------: | -------: | ----------------: | ----------------: | ------------------: |
| 2020-01-01 | red | 1 | 25 | 6 | null | 2020-01-02 00:00:00 | red | 0 | 0 | 1 | 25 | null |
| 2020-01-01 | green | 1 | 20 | 6 | null | 2020-01-02 00:00:00 | green | 0 | 0 | 1 | 20 | null |
| 2020-01-01 | white | 4 | 40 | 6 | 2020-01-02 | 2020-01-02 00:00:00 | white | 5 | 50 | 1 | 10 | 9 |
| 2020-01-02 | pink | 4 | 60 | 9 | 2020-01-03 | 2020-01-03 00:00:00 | pink | 3 | 45 | -1 | -15 | 6 |
| 2020-01-02 | white | 5 | 50 | 9 | null | 2020-01-03 00:00:00 | white | 0 | 0 | 5 | 50 | null |
| 2020-01-03 | pink | 3 | 45 | 6 | null | 2020-01-04 00:00:00 | pink | 0 | 0 | 3 | 45 | null |
| 2020-01-03 | green | 3 | 60 | 6 | null | 2020-01-04 00:00:00 | green | 0 | 0 | 3 | 60 | null |
| null | null | null | null | null | 2020-01-02 | 2020-01-02 00:00:00 | pink | 4 | 60 | null | null | 9 |
| null | null | null | null | null | 2020-01-01 | 2020-01-01 00:00:00 | white | 4 | 40 | null | null | 6 |
| null | null | null | null | null | 2020-01-03 | 2020-01-03 00:00:00 | green | 3 | 60 | null | null | 6 |
| null | null | null | null | null | 2020-01-01 | 2020-01-01 00:00:00 | green | 1 | 20 | null | null | 6 |
| null | null | null | null | null | 2020-01-01 | 2020-01-01 00:00:00 | red | 1 | 25 | null | null | 6 |

I don't think you need a `full join`. Window functions can get the job done:

select 
    date_local,
    colour,
    no_colour,
    sum_amount,
    total_colour,
    is_matched,
    case when is_matched = 1
        then lead(date_local) over(partition by colour order by date_local)
    end date_local_2,
    case when is_matched = 1
        then lead(colour) over(partition by colour order by date_local) 
    end colour_2,
    case when is_matched = 1 
        then lead(no_colour) over(partition by colour order by date_local) 
    end no_colour_2,
    case when is_matched = 1 
        then lead(sum_amount) over(partition by colour order by date_local) 
    end sum_amount_2,
    case when is_matched = 1 
        then lead(total_colour) over(partition by colour order by date_local) 
    end total_colour_2  
from (
    select
        date_local,
        colour,
        count(*) no_colour,
        sum(amount) sum_amount,
        case when lead(date_local) over(partition by colour order by date_local) 
            = date_local + interval '1' day
            then 1
        end is_matched,
        sum(count(*)) over(partition by date_local) total_colour
    from colours
    group by date_local, colour
) t
order by date_local, colour
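The inner query aggregates by day and colour, and computes the group-level metrics as well as the total count of records per day; it also sets a flag, `is_matched`, that indicates whether an "adjacent" record of the same colour exists on the next day.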
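
To look at the per-day total on its own, here is a minimal sketch that runs only that part of the inner query against the same `colours` table (columns `date_local`, `colour` and `amount`, as used above). `sum(count(*)) over(partition by date_local)` is a window function applied on top of an aggregate: `count(*)` is evaluated once per (day, colour) group, and the window then sums those counts across all groups that share the same day.

```sql
-- Per-group counts next to the per-day total, without the other columns.
select
    date_local,
    colour,
    count(*) as no_colour,                                       -- rows in this (day, colour) group
    sum(count(*)) over(partition by date_local) as total_colour  -- rows for the whole day
from colours
group by date_local, colour
order by date_local, colour
```

With the data from the question, the three groups on 2020-01-01 have counts 1, 1 and 4, so each of those rows gets `total_colour` = 6.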

The outer query then uses the window function `lead()` to fetch the values from the adjacent row.
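
The question's output also has `colour_difference` and `amount_difference` columns, with 0 rather than null when a colour is absent on the next day. The following is a sketch of one way to derive them from the same `lead()` idea. The CTE names (`daily`, `paired`, `next_day`) and the sign convention (next day minus current day) are assumptions made for illustration, not something the question specifies.

```sql
-- Sketch only: CTE names and the sign convention are assumptions.
with daily as (
    -- one row per (day, colour)
    select
        date_local,
        colour,
        count(*) as no_colour,
        sum(amount) as sum_amount
    from colours
    group by date_local, colour
),
paired as (
    -- attach the next row of the same colour, whatever its date
    select
        date_local,
        colour,
        no_colour,
        sum_amount,
        lead(date_local) over(partition by colour order by date_local) as next_date,
        lead(no_colour)  over(partition by colour order by date_local) as next_no_colour,
        lead(sum_amount) over(partition by colour order by date_local) as next_sum_amount
    from daily
),
next_day as (
    -- keep the next row's values only if it falls exactly one day later, else 0
    select
        date_local,
        colour,
        no_colour,
        sum_amount,
        case when next_date = date_local + interval '1' day
            then next_no_colour else 0
        end as no_colour_2,
        case when next_date = date_local + interval '1' day
            then next_sum_amount else 0
        end as sum_amount_2
    from paired
)
select
    date_local,
    colour,
    no_colour,
    sum_amount,
    no_colour_2,
    sum_amount_2,
    no_colour_2 - no_colour   as colour_difference,
    sum_amount_2 - sum_amount as amount_difference
from next_day
order by date_local, colour
```

A colour that disappears on the next day then shows up as a negative difference equal to its full count and amount.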

This yields:

| date_local | colour | no_colour | sum_amount | total_colour | is_matched | date_local_2 | colour_2 | no_colour_2 | sum_amount_2 | total_colour_2 |
| :--------- | :----- | --------: | ---------: | -----------: | ---------: | :----------- | :------- | ----------: | -----------: | -------------: |
| 2020-01-01 | green | 1 | 20 | 6 | null | null | null | null | null | null |
| 2020-01-01 | red | 1 | 25 | 6 | null | null | null | null | null | null |
| 2020-01-01 | white | 4 | 40 | 6 | 1 | 2020-01-02 | white | 5 | 50 | 9 |
| 2020-01-02 | pink | 4 | 60 | 9 | 1 | 2020-01-03 | pink | 3 | 45 | 6 |
| 2020-01-02 | white | 5 | 50 | 9 | null | null | null | null | null | null |
| 2020-01-03 | green | 3 | 60 | 6 | null | null | null | null | null | null |
| 2020-01-03 | pink | 3 | 45 | 6 | null | null | null | null | null | null |
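
If the next day's total is needed on every row, including rows whose colour does not appear on the next day (what the question's `this_is_not_working` column was after), `lead()` partitioned by colour is not enough, because it only sees rows of the same colour. One possible approach, sketched here under the assumption that the total depends on the date alone, is to compute the per-day totals separately and join them back on `date_local + interval '1' day`. The CTE names and the `total_colour_next_day` alias are hypothetical.

```sql
-- Sketch only: per_colour, per_day and total_colour_next_day are illustrative names.
with per_colour as (
    select
        date_local,
        colour,
        count(*) as no_colour,
        sum(amount) as sum_amount
    from colours
    group by date_local, colour
),
per_day as (
    -- total number of records per day, independent of colour
    select
        date_local,
        count(*) as total_colour
    from colours
    group by date_local
)
select
    c.date_local,
    c.colour,
    c.no_colour,
    c.sum_amount,
    d.total_colour,
    n.total_colour as total_colour_next_day
from per_colour c
join per_day d
    on d.date_local = c.date_local
left join per_day n                          -- left join: the last day has no successor
    on n.date_local = c.date_local + interval '1' day
order by c.date_local, c.colour
```

With the question's data, every 2020-01-01 row would get 9 and every 2020-01-02 row would get 6. Rows for 2020-01-03 stay null, since the table holds no records for 2020-01-04.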