SQL: full outer self-join with data from different dates
Tags: sql, date, group-by, presto, full-outer-join

I want to compare each day's values with the values of the following day (and also see which colours are new or no longer appear).

I have done a full outer self-join and replaced the null values on the right-hand side. is_matched shows whether the join found a match, or whether the right-hand side was null before the coalesce.

The only problem is the last column, this_is_not_working. It should hold the total_colour value for date_local_2 rather than the one for date_local, and I don't know how to replace all the nulls in this_is_not_working with that value. I tried window functions and intervals, but nothing really worked.

I created this example in Postgres, but I am actually using Presto.
select * from colours
Current output of the full outer self-join:
date_local | colour | colours | amount | total_colour | is_matched | date_local_2 | colour_2 | colour_2 | amount_2 | colour_difference | amount_difference | this_is_not_working
:--------- | :----- | ------: | -----: | -----------: | :--------- | :------------------ | :------- | -------: | -------: | ----------------: | ----------------: | ------------------:
2020-01-01 | red | 1 | 25 | 6 | null | 2020-01-02 00:00:00 | red | 0 | 0 | 1 | 25 | null
2020-01-01 | green | 1 | 20 | 6 | null | 2020-01-02 00:00:00 | green | 0 | 0 | 1 | 20 | null
2020-01-01 | white | 4 | 40 | 6 | 2020-01-02 | 2020-01-02 00:00:00 | white | 5 | 50 | 1 | 10 | 9
2020-01-02 | pink | 4 | 60 | 9 | 2020-01-03 | 2020-01-03 00:00:00 | pink | 3 | 45 | -1 | -15 | 6
2020-01-02 | white | 5 | 50 | 9 | null | 2020-01-03 00:00:00 | white | 0 | 0 | 5 | 50 | null
2020-01-03 | pink | 3 | 45 | 6 | null | 2020-01-04 00:00:00 | pink | 0 | 0 | 3 | 45 | null
2020-01-03 | green | 3 | 60 | 6 | null | 2020-01-04 00:00:00 | green | 0 | 0 | 3 | 60 | null
null | null | null | null | null | 2020-01-02 | 2020-01-02 00:00:00 | pink | 4 | 60 | null | null | 9
null | null | null | null | null | 2020-01-01 | 2020-01-01 00:00:00 | white | 4 | 40 | null | null | 6
null | null | null | null | null | 2020-01-03 | 2020-01-03 00:00:00 | green | 3 | 60 | null | null | 6
null | null | null | null | null | 2020-01-01 | 2020-01-01 00:00:00 | green | 1 | 20 | null | null | 6
null | null | null | null | null | 2020-01-01 | 2020-01-01 00:00:00 | red | 1 | 25 | null | null | 6
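The raw rows of colours are not shown in the question. For anyone who wants to experiment, the following is a minimal sketch of a dataset that is consistent with the per-day, per-colour counts and sums in the output above. The column types and every individual amount are assumptions made for illustration; only the aggregates (for example, 4 white rows summing to 40 on 2020-01-01) come from the question.

```sql
-- Hypothetical sample data: only the counts and sums per (date_local, colour)
-- match the question; the individual amounts and column types are assumed.
create table colours (
    date_local date,
    colour     varchar(20),
    amount     integer
);

insert into colours (date_local, colour, amount) values
    -- 2020-01-01: red 1 row (25), green 1 row (20), white 4 rows (40)
    (date '2020-01-01', 'red',   25),
    (date '2020-01-01', 'green', 20),
    (date '2020-01-01', 'white', 10),
    (date '2020-01-01', 'white', 10),
    (date '2020-01-01', 'white', 10),
    (date '2020-01-01', 'white', 10),
    -- 2020-01-02: pink 4 rows (60), white 5 rows (50)
    (date '2020-01-02', 'pink',  15),
    (date '2020-01-02', 'pink',  15),
    (date '2020-01-02', 'pink',  15),
    (date '2020-01-02', 'pink',  15),
    (date '2020-01-02', 'white', 10),
    (date '2020-01-02', 'white', 10),
    (date '2020-01-02', 'white', 10),
    (date '2020-01-02', 'white', 10),
    (date '2020-01-02', 'white', 10),
    -- 2020-01-03: pink 3 rows (45), green 3 rows (60)
    (date '2020-01-03', 'pink',  15),
    (date '2020-01-03', 'pink',  15),
    (date '2020-01-03', 'pink',  15),
    (date '2020-01-03', 'green', 20),
    (date '2020-01-03', 'green', 20),
    (date '2020-01-03', 'green', 20);
```

The statements stick to syntax that Postgres accepts; on Presto, whether create table and insert are available depends on the connector in use.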
I don't think you need a full join. Window functions can do the job:
    select
        date_local,
        colour,
        no_colour,
        sum_amount,
        total_colour,
        is_matched,
        -- expose the next row's values only when it belongs to the following day
        case when is_matched = 1
            then lead(date_local) over(partition by colour order by date_local)
        end date_local_2,
        case when is_matched = 1
            then lead(colour) over(partition by colour order by date_local)
        end colour_2,
        case when is_matched = 1
            then lead(no_colour) over(partition by colour order by date_local)
        end no_colour_2,
        case when is_matched = 1
            then lead(sum_amount) over(partition by colour order by date_local)
        end sum_amount_2,
        case when is_matched = 1
            then lead(total_colour) over(partition by colour order by date_local)
        end total_colour_2
    from (
        -- one row per day and colour
        select
            date_local,
            colour,
            count(*) no_colour,
            sum(amount) sum_amount,
            -- 1 when the same colour also has rows on the following day, else null
            case when lead(date_local) over(partition by colour order by date_local)
                = date_local + interval '1' day
                then 1
            end is_matched,
            -- total number of rows recorded on that day, across all colours
            sum(count(*)) over(partition by date_local) total_colour
        from colours
        group by date_local, colour
    ) t
    order by date_local, colour
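The inner query aggregates by day and colour, computing the group-level metrics as well as the total number of rows recorded per day; it also sets a flag indicating whether an "adjacent" record for the same colour exists on the following day.
The outer query then uses the window function lead() to pull the values from that adjacent row.
With the sample data, this yields: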
date_local | colour | no_colour | sum_amount | total_colour | is_matched | date_local_2 | colour_2 | no_colour_2 | sum_amount_2 | total_colour_2
:--------- | :----- | --------: | ---------: | -----------: | ---------: | :----------- | :------- | ----------: | -----------: | -------------:
2020-01-01 | green | 1 | 20 | 6 | null | null | null | null | null | null
2020-01-01 | red | 1 | 25 | 6 | null | null | null | null | null | null
2020-01-01 | white | 4 | 40 | 6 | 1 | 2020-01-02 | white | 5 | 50 | 9
2020-01-02 | pink | 4 | 60 | 9 | 1 | 2020-01-03 | pink | 3 | 45 | 6
2020-01-02 | white | 5 | 50 | 9 | null | null | null | null | null | null
2020-01-03 | green | 3 | 60 | 6 | null | null | null | null | null | null
2020-01-03 | pink | 3 | 45 | 6 | null | null | null | null | null | null
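The question also asks which colours are new or no longer appear. The same adjacency test can be sketched in both directions with lag() and lead(). This is an illustration rather than part of the original answer: the column names is_new and is_gone_next_day are made up, and "new" is assumed to mean "no rows for this colour on the previous calendar day".

```sql
-- Sketch: flag colours that appear without a previous-day record (is_new)
-- or that have no record on the following day (is_gone_next_day).
-- Assumes the colours table from the question; the flag names are illustrative.
select
    date_local,
    colour,
    count(*) no_colour,
    -- 0 when the same colour has rows on the previous calendar day, else 1
    case when lag(date_local) over(partition by colour order by date_local)
              = date_local - interval '1' day
         then 0 else 1
    end is_new,
    -- 0 when the same colour has rows on the next calendar day, else 1
    case when lead(date_local) over(partition by colour order by date_local)
              = date_local + interval '1' day
         then 0 else 1
    end is_gone_next_day
from colours
group by date_local, colour
order by date_local, colour
```

When lag() or lead() returns null, the comparison is unknown and the else branch applies, so the flag is 1. For white on 2020-01-02 this would give is_new = 0 and is_gone_next_day = 1, since white has rows on 2020-01-01 but none on 2020-01-03. Note that every colour on the first day of the data is flagged as new and every colour on the last day as gone, simply because there is no neighbouring day to compare against.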