SQL JOIN is inflating my totals even though the join is correct
I'll start with two separate queries, each of which gives the correct result on its own:
SELECT
DATE_TRUNC(ga.traffic_date, WEEK(MONDAY)) week_start,
SUM(traffic) traffic
FROM
`ga.daily_traffic` ga
WHERE traffic_date >= '2019-03-04'
GROUP BY 1
Returns:
+------------+---------+
| week_start | traffic |
+------------+---------+
| 2019-03-04 |   66572 |
+------------+---------+
The second query:
SELECT
week_start,
SUM(spend) spend
FROM
`marketing.channel_spend`
WHERE week_start = '2019-03-04'
GROUP BY 1
Returns:
+------------+----------+
| week_start | spend    |
+------------+----------+
| 2019-03-04 | 80143.07 |
+------------+----------+
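For the second query, I should note that the week_start field is already stored in weekly increments. That may be the cause of what happens when I join the two together, like this: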
SELECT
week_start,
SUM(spend) spend,
SUM(traffic) traffic
FROM
`ga.daily_traffic` ga
LEFT JOIN `marketing.channel_spend` chan
ON DATE_TRUNC(ga.traffic_date, WEEK(MONDAY)) = chan.week_start
WHERE week_start = '2019-03-04'
GROUP BY 1
ORDER BY 1 DESC
Which produces the following result:
+------------+---------+-----------+
| week_start | traffic | spend     |
+------------+---------+-----------+
| 2019-03-04 |  153115 | 561001.49 |
+------------+---------+-----------+
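What is causing the traffic and spend totals to balloon?
You can use CTEs, so that each table is aggregated to one row per week before the join: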
WITH cte AS
(
SELECT
DATE_TRUNC(traffic_date, WEEK(MONDAY)) week_start,
SUM(traffic) traffic
FROM
`ga.daily_traffic`
WHERE traffic_date >= '2019-03-04'
GROUP BY week_start
), cte2 AS
(
SELECT
week_start,
SUM(spend) spend
FROM
`marketing.channel_spend`
WHERE week_start = '2019-03-04'
GROUP BY week_start
)
SELECT cte.week_start, cte.traffic, cte2.spend
FROM cte
LEFT JOIN cte2 ON cte.week_start = cte2.week_start
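The pre-aggregation idea can be tried on toy data without touching BigQuery. The following is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table contents, the channel names, and the precomputed week_start column on daily_traffic are all assumptions made for illustration (SQLite has no DATE_TRUNC), not the asker's data.

```python
import sqlite3

# Toy stand-ins for the two tables (hypothetical data, not the asker's).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_traffic (traffic_date TEXT, week_start TEXT, traffic INTEGER);
    CREATE TABLE channel_spend (week_start TEXT, channel TEXT, spend REAL);
""")

# Seven daily rows for the week starting Monday 2019-03-04, 100 visits each.
conn.executemany(
    "INSERT INTO daily_traffic VALUES (?, '2019-03-04', 100)",
    [(f"2019-03-{day:02d}",) for day in range(4, 11)],
)

# Three channel rows for the same week.
conn.executemany(
    "INSERT INTO channel_spend VALUES ('2019-03-04', ?, ?)",
    [("search", 10.0), ("social", 20.0), ("email", 30.0)],
)

# Each CTE collapses its table to one row per week, so the join is one-to-one.
PRE_AGGREGATED = """
WITH traffic_by_week AS (
    SELECT week_start, SUM(traffic) AS traffic
    FROM daily_traffic
    GROUP BY week_start
), spend_by_week AS (
    SELECT week_start, SUM(spend) AS spend
    FROM channel_spend
    GROUP BY week_start
)
SELECT t.week_start, t.traffic, s.spend
FROM traffic_by_week t
LEFT JOIN spend_by_week s ON t.week_start = s.week_start
"""

rows = conn.execute(PRE_AGGREGATED).fetchall()
print(rows)
```

With this toy data the query returns a single row for the week, with traffic 700 (7 days at 100) and spend 60.0 (10 + 20 + 30), matching what each table gives when summed on its own.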
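Gordon is right. You most likely have a many-to-one or many-to-many relationship between the marketing.channel_spend and ga.daily_traffic tables. When the same week appears two or more times in both tables, the join pairs every matching row in the first table with every matching row in the second, and that inflates your sums. You should aggregate before joining, so that the join on the date is one-to-one and no row is duplicated by the join: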
SELECT
ga.ga_date week_start,
chan.spend spend,
ga.traffic traffic
FROM (
-- one row per week of traffic
SELECT
SUM(traffic) traffic,
DATE_TRUNC(traffic_date, WEEK(MONDAY)) ga_date
FROM
`ga.daily_traffic`
GROUP BY
ga_date
) ga
LEFT JOIN (
-- one row per week of spend
SELECT
SUM(spend) spend,
week_start
FROM
`marketing.channel_spend`
GROUP BY
week_start
) chan ON ga.ga_date = chan.week_start
-- filter on the left side so the LEFT JOIN still keeps weeks without spend
WHERE ga.ga_date = '2019-03-04'
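The multiplication described in this answer is easy to see on toy data. Below is a small sketch using Python's sqlite3 module in place of BigQuery. The row counts and values are made up for illustration, and week_start is assumed to be already present on both tables. It counts rows per join key in each table and then runs the un-aggregated join. As a sanity check against the question's own numbers, 561001.49 is exactly 7 × 80143.07, which is consistent with every spend row being matched to seven daily traffic rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE daily_traffic (week_start TEXT, traffic INTEGER);
    CREATE TABLE channel_spend (week_start TEXT, spend REAL);
""")

# Hypothetical data: 7 daily rows and 3 spend rows for the same week.
conn.executemany(
    "INSERT INTO daily_traffic VALUES ('2019-03-04', ?)", [(100,)] * 7
)
conn.executemany(
    "INSERT INTO channel_spend VALUES ('2019-03-04', ?)",
    [(10.0,), (20.0,), (30.0,)],
)


def rows_per_week(table):
    """Count how many rows each table holds per join key."""
    query = f"SELECT week_start, COUNT(*) FROM {table} GROUP BY week_start"
    return conn.execute(query).fetchall()


traffic_rows = rows_per_week("daily_traffic")   # 7 rows for the week
spend_rows = rows_per_week("channel_spend")     # 3 rows for the week

# Joining the raw tables pairs every daily row with every spend row:
# 7 x 3 = 21 joined rows, so traffic is counted 3 times and spend 7 times.
naive = conn.execute("""
    SELECT t.week_start, COUNT(*), SUM(t.traffic), SUM(s.spend)
    FROM daily_traffic t
    LEFT JOIN channel_spend s ON t.week_start = s.week_start
    GROUP BY t.week_start
""").fetchall()

print(traffic_rows, spend_rows, naive)
```

Here the true totals are 700 traffic and 60.0 spend, but the raw join reports 2100 and 420.0. If either per-key count is greater than 1 in the real tables, that side needs to be aggregated before the join.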
You need to aggregate before joining. Your join is multiplying the number of rows.
@KarlMarxdown You can only accept one answer. When you marked the second answer, the first one became unmarked, so you should mark whichever answer was given first.