Snowflake Cloud Data Platform: outer join with USING is not equivalent to ON
Why do the following queries not produce the same result?
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r using(id);
ID ID
1 1
2 2
3 3
4 4
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r on r.id = l.id;
ID ID
1 1
2 NULL
3 NULL
NULL 4
According to the documentation:
o1 JOIN o2 USING (key_column)
is equivalent to o1 JOIN o2 ON o2.key_column = o1.key_column
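I guess this counts as non-standard usage, so don't do it. Specifically:
To use the USING clause properly, the projection list (the list of columns and other expressions after the SELECT keyword) should be "*".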
Spark SQL produces the same result as Snowflake, but psql produces the result I expect, so... I guess the engines are simply inconsistent here.
scala> spark.sql("with l as (select col1 id from values(1), (2), (3)) , r as (select col1 id from values(1), (4)) select * from l full outer join r using(id)").show()
+---+
| id|
+---+
| 1|
| 3|
| 4|
| 2|
+---+
scala> spark.sql("with l as (select col1 id from values(1), (2), (3)) , r as (select col1 id from values(1), (4)) select * from l full outer join r on l.id = r.id").show()
+----+----+
| id| id|
+----+----+
| 1| 1|
| 3|null|
|null| 4|
| 2|null|
+----+----+
psql> with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r using(id);
id id
1 1
2 (null)
3 (null)
(null) 4
This behavior does look odd. The recommended approach is to use ON rather than USING.
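From a Snowflake community discussion:
According to the ANSI standard,
FROM t1
FULL OUTER JOIN t2
USING (c)
produces the expression coalesce(t1.c, t2.c) AS c. Subsequent references to t1.c and t2.c are therefore not actually defined by the standard. MySQL, Postgres, and Snowflake all support these references, but with different semantics. In Snowflake, t1.c and t2.c are simply aliases for c.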
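One way to avoid depending on how a given engine resolves t1.c and t2.c after USING is to join with ON and write the COALESCE yourself. The following is a minimal sketch using the sample data from the question; the output aliases l_id and r_id are illustrative names, not part of the original thread.

```sql
-- Sketch: spell out the merged column that USING (id) would produce,
-- while keeping each side's own id visible under a distinct alias.
with l as (select $1 id from values (1), (2), (3))
   , r as (select $1 id from values (1), (4))
select coalesce(l.id, r.id) as id   -- merged join key
     , l.id as l_id                 -- NULL when the row exists only in r
     , r.id as r_id                 -- NULL when the row exists only in l
from l full outer join r on r.id = l.id;
```

Because every column is named explicitly, the query should not depend on the engine-specific meaning of qualified references to the USING column.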
What do you get in Snowflake if you run SELECT * instead of SELECT l.*, r.*? The same as Spark SQL: a single column containing the ids from both tables.
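For reference, the query that comment describes would look roughly like this, assuming the same sample data as in the question:

```sql
-- Sketch: USING (id) combined with SELECT *, the form the quoted
-- documentation recommends. Per the comment above, Snowflake returns
-- a single ID column holding the ids from both tables.
with l as (select $1 id from values (1), (2), (3))
   , r as (select $1 id from values (1), (4))
select * from l full outer join r using (id);
```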