Snowflake Cloud Data Platform: need help understanding why multiple left joins aren't returning what I expect in Snowflake


I'm having some problems with multiple left joins not doing what I expect.

select 
    sent.id,
    sent.ts,
    sent.email,
    delivered.ts,
    type.label,
    min(opens.ts) as first_open,
    count(opens.id) as open_count,
    min(clicks.ts) as first_click,
    count(clicks.id) as click_count
from sent
inner join type on type.id = sent.type_id
left outer join delivered on (delivered.id = sent.id)
left outer join opens on (opens.id = sent.id)
left outer join clicks on (clicks.id = sent.id)
where sent.id = 'a1b1c1d1e1'
group by 
    sent.id,
    sent.ts,
    sent.email,
    delivered.ts,
    type.label,
    opens.id,
    clicks.id
;

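A message is sent and then delivered; this is 1-to-1, though the delivered row may not exist.

A message can then be opened (many times) and clicked (many times), all tied back to sent.id.

If I join only opens, it works fine, and likewise if I join only clicks.

When I add clicks alongside opens, first_click and click_count show the same values as the opens columns.

I get:
1, 2020-01-01 00:00:00, a@b.com, 2020-01-01 00:00:00, test, 2020-01-01 01:00:00, 4, 2020-01-01 01:00:00, 4

When it should be:
1, 2020-01-01 00:00:00, a@b.com, 2020-01-01 00:00:00, test, 2020-01-01 01:00:00, 4, 2020-01-01 02:00:00, 1

I have tried running without the query cache (ALTER SESSION SET USE_CACHED_RESULT = false;), and I built a basic mirror in MySQL to prove the joins are correct.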
So, trying to bridge the gap between the problem description and the results you mention, let's start with known data:
create or replace table sent (id text, ts timestamp_ntz, email text, type_id number);
create or replace table type (id number, label text);
create or replace table delivered(id text, ts timestamp_ntz);
create or replace table opens(id text, ts timestamp_ntz);
create or replace table clicks(id text, ts timestamp_ntz);

insert into sent values ('a1b1c1d1e1', '2020-01-01 01:00', 'a@b.com', 1);
insert into delivered values ('a1b1c1d1e1', '2020-01-01 02:00');
insert into type values (1, 'test');
insert into opens values ('a1b1c1d1e1', '2020-01-01 03:00'),('a1b1c1d1e1', '2020-01-01 04:00'),('a1b1c1d1e1', '2020-01-01 05:00'),('a1b1c1d1e1', '2020-01-01 06:00');
insert into clicks values ('a1b1c1d1e1', '2020-01-01 07:00');

select 
    sent.id
    ,sent.ts
    ,sent.email
    ,delivered.ts
    ,type.label
    ,min(opens.ts) as first_open
    ,count(opens.id) as open_count
    ,min(clicks.ts) as first_click
    ,count(clicks.id) as click_count
from sent
join type on type.id = sent.type_id
left join delivered on (delivered.id = sent.id)
left join opens on (opens.id = sent.id)
left join clicks on (clicks.id = sent.id)
where sent.id = 'a1b1c1d1e1'
group by 1,2,3,4, 5;

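I swapped the column names in the GROUP BY for their positions because I prefer it that way, but you don't need opens.id or clicks.id there, because those columns are not selected as non-aggregate columns. This returns:

 ID          TS                       EMAIL    TS                       LABEL  FIRST_OPEN               OPEN_COUNT  FIRST_CLICK              CLICK_COUNT
 a1b1c1d1e1  2020-01-01 01:00:00.000  a@b.com  2020-01-01 02:00:00.000  test   2020-01-01 03:00:00.000  4           2020-01-01 07:00:00.000  4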

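I'm not sure what behavior you are changing, but printing all the rows to see what is happening, and to understand why you are not getting what you want, might help: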
select 
    sent.id
    ,sent.ts
    ,sent.email
    ,delivered.ts
    ,type.label
    ,opens.ts as open_ts
    ,clicks.ts as click_ts
    --,min(opens.ts) as first_open
    --,count(opens.id) as open_count
    --,min(clicks.ts) as first_click
    --,count(clicks.id) as click_count
from sent
join type on type.id = sent.type_id
left join delivered on (delivered.id = sent.id)
left join opens on (opens.id = sent.id)
left join clicks on (clicks.id = sent.id)
where sent.id = 'a1b1c1d1e1'
--group by 1,2,3,4, 5;

gives me:

 ID          TS                       EMAIL    TS                       LABEL  OPEN_TS                  CLICK_TS
 a1b1c1d1e1  2020-01-01 01:00:00.000  a@b.com  2020-01-01 02:00:00.000  test   2020-01-01 03:00:00.000  2020-01-01 07:00:00.000
 a1b1c1d1e1  2020-01-01 01:00:00.000  a@b.com  2020-01-01 02:00:00.000  test   2020-01-01 04:00:00.000  2020-01-01 07:00:00.000
 a1b1c1d1e1  2020-01-01 01:00:00.000  a@b.com  2020-01-01 02:00:00.000  test   2020-01-01 05:00:00.000  2020-01-01 07:00:00.000
 a1b1c1d1e1  2020-01-01 01:00:00.000  a@b.com  2020-01-01 02:00:00.000  test   2020-01-01 06:00:00.000  2020-01-01 07:00:00.000

which is what I would expect for either a left join or a normal inner join.
Please feel free to update the question with SQL that gives you the incomplete results, along with a version of the output like the one listed above, to get a better explanation.
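
The row listing above suggests why click_count comes out as 4: each of the four opens rows is paired with the single clicks row, so count(clicks.id) counts four joined rows. One way to get independent counts, sketched here against the sample tables above, is to aggregate opens and clicks separately before joining them to sent. The names open_stats, click_stats and delivered_ts are made up for this illustration, and the sketch assumes every open and click is one row keyed by the sent id.

```sql
-- Aggregate each event table on its own, so that neither one
-- multiplies the rows of the other when both are joined to sent.
with open_stats as (
    select id, min(ts) as first_open, count(*) as open_count
    from opens
    group by id
),
click_stats as (
    select id, min(ts) as first_click, count(*) as click_count
    from clicks
    group by id
)
select
    sent.id
    ,sent.ts
    ,sent.email
    ,delivered.ts as delivered_ts
    ,type.label
    ,open_stats.first_open
    ,coalesce(open_stats.open_count, 0) as open_count    -- 0 when never opened
    ,click_stats.first_click
    ,coalesce(click_stats.click_count, 0) as click_count -- 0 when never clicked
from sent
join type on type.id = sent.type_id
left join delivered on (delivered.id = sent.id)
left join open_stats on (open_stats.id = sent.id)
left join click_stats on (click_stats.id = sent.id)
where sent.id = 'a1b1c1d1e1';
```

With the sample data above, this should return a single row with open_count 4 and click_count 1. Because each CTE yields at most one row per id, the outer query needs no GROUP BY.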
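This can't be answered without sample data to reproduce the problem. The only thing that comes to mind about your code is: do you really want to group by opens.id and clicks.id? That doesn't make sense for the aggregation.
There is a problem with the question as asked: your SQL selects sent.id as the first column, but you have a where clause of where sent.id = 'a1b1c1d1e1', and your example output shows 1.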