SQL JOIN is inflating my totals even though it is correct


I'll start with two separate queries, each of which gives the correct result on its own:

SELECT
  DATE_TRUNC(ga.traffic_date, WEEK(MONDAY)) week_start,
  SUM(traffic) traffic
FROM
  `ga.daily_traffic` ga
WHERE traffic_date >= '2019-03-04'
GROUP BY 1
Returns:

+--------------+---------+
| week_start   | traffic |
+--------------+---------+
| 2019-03-04   |   66572 |
+--------------+---------+
The second query:

SELECT
  week_start,
  SUM(spend) spend
FROM
  `marketing.channel_spend`
WHERE week_start = '2019-03-04'
GROUP BY 1
Returns:

+------------+----------+
| week_start |  spend   |
+------------+----------+
| 2019-03-04 | 80143.07 |
+------------+----------+
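For the second query I should note that the week_start field is already stored in weekly increments, which may be why this happens when I join the two together, for example: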

SELECT
  week_start,
  SUM(spend) spend,
  SUM(traffic) traffic
FROM
  `ga.daily_traffic` ga
LEFT JOIN `marketing.channel_spend` chan
ON DATE_TRUNC(ga.traffic_date, WEEK(MONDAY)) = chan.week_start
WHERE week_start = '2019-03-04'
GROUP BY 1
ORDER BY 1 DESC
Produces the following result:

+------------+---------+-----------+
| week_start | traffic |   spend   |
+------------+---------+-----------+
| 2019-03-04 |  153115 | 561001.49 |
+------------+---------+-----------+
What is causing the traffic and spend totals to inflate?

You can use CTEs, aggregating each table on its own before joining:

with cte as
(
-- weekly traffic: one row per week
SELECT
  DATE_TRUNC(traffic_date, WEEK(MONDAY)) week_start,
  SUM(traffic) traffic
FROM
  `ga.daily_traffic`
WHERE traffic_date >= '2019-03-04'
GROUP BY 1
),cte2 as
(
-- weekly spend: one row per week
SELECT
  week_start,
  SUM(spend) spend
FROM
  `marketing.channel_spend`
WHERE week_start = '2019-03-04'
GROUP BY 1
) select cte.week_start, cte.traffic, cte2.spend from cte left join cte2 on cte.week_start = cte2.week_start

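Gordon is right. You most likely have a many-to-one or many-to-many relationship between the marketing.channel_spend and ga.daily_traffic tables. When the same date appears two or more times in both tables, the join pairs every matching row in the first table with every matching row in the second, and that inflates your sums. You should aggregate before joining so that the join on that date is one-to-one, which means no row is counted more than once: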

SELECT
    ga.ga_date week_start,
    chan.spend spend,
    ga.traffic traffic
FROM (
    -- collapse daily traffic to one row per week
    SELECT
        SUM(traffic) traffic,
        DATE_TRUNC(traffic_date, WEEK(MONDAY)) ga_date
    FROM
        `ga.daily_traffic`
    GROUP BY
        ga_date
) ga
LEFT JOIN (
    -- collapse spend to one row per week
    SELECT
        SUM(spend) spend,
        week_start
    FROM
        `marketing.channel_spend`
    GROUP BY
        week_start
) chan ON ga.ga_date = chan.week_start
-- filter on the left side so the LEFT JOIN is not turned into an inner join
WHERE ga.ga_date = '2019-03-04'
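
To make the row multiplication concrete, here is a minimal sketch that reproduces it on toy data. It uses Python's built-in sqlite3 rather than BigQuery, so the table names, the precomputed week_start column on the traffic table (SQLite has no DATE_TRUNC), the channel names, and all of the numbers are assumptions made up for illustration. Seven daily traffic rows are joined to three spend rows for the same week. For comparison, the inflated spend in the question (561001.49) is exactly 7 times the standalone figure (80143.07), which is what you would expect if each spend row matched seven daily rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE daily_traffic (traffic_date TEXT, week_start TEXT, traffic INTEGER);
CREATE TABLE channel_spend (week_start TEXT, channel TEXT, spend INTEGER);
""")

# Hypothetical data: 7 daily rows of 10 visits each (true weekly total: 70).
days = [(f"2019-03-{d:02d}", "2019-03-04", 10) for d in range(4, 11)]
conn.executemany("INSERT INTO daily_traffic VALUES (?, ?, ?)", days)

# Hypothetical data: 3 channel rows of 100 each (true weekly total: 300).
channels = [("2019-03-04", c, 100) for c in ("search", "social", "email")]
conn.executemany("INSERT INTO channel_spend VALUES (?, ?, ?)", channels)

# Join first, aggregate second: 7 x 3 = 21 joined rows feed both SUMs.
naive = conn.execute("""
    SELECT SUM(ga.traffic), SUM(chan.spend)
    FROM daily_traffic ga
    LEFT JOIN channel_spend chan ON ga.week_start = chan.week_start
""").fetchone()

# Aggregate first, join second: one row per week on each side.
fixed = conn.execute("""
    SELECT ga.traffic, chan.spend
    FROM (SELECT week_start, SUM(traffic) AS traffic
          FROM daily_traffic GROUP BY week_start) ga
    LEFT JOIN (SELECT week_start, SUM(spend) AS spend
               FROM channel_spend GROUP BY week_start) chan
      ON ga.week_start = chan.week_start
""").fetchone()

print("join then aggregate:", naive)  # traffic x3, spend x7
print("aggregate then join:", fixed)  # the true weekly totals
```

In the naive query each side is multiplied by the number of matching rows on the other side, so traffic is tripled and spend is multiplied by seven. Aggregating each table to one row per week first removes both multipliers.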

You need to aggregate before joining. Your join is multiplying the number of rows.

@KarlMarxdown You can only accept one answer. Marking the second answer removed the mark from the first, so you need to mark the answer that was given first.
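
One way to check whether a join will multiply rows is to count rows per join key on each side before joining. The sketch below does this with Python's built-in sqlite3 on made-up data. The helper name rows_per_key, the table layout, and the values are hypothetical, and the helper interpolates identifiers directly into the SQL, so it should only be given trusted table and column names.

```python
import sqlite3

def rows_per_key(conn, table, key):
    """Return (key value, row count) for every key that occurs more than once.

    Identifiers are interpolated into the SQL text, so pass trusted names only.
    """
    sql = (
        f"SELECT {key}, COUNT(*) FROM {table} "
        f"GROUP BY {key} HAVING COUNT(*) > 1 ORDER BY {key}"
    )
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE channel_spend (week_start TEXT, channel TEXT, spend INTEGER)")
conn.executemany(
    "INSERT INTO channel_spend VALUES (?, ?, ?)",
    [
        ("2019-03-04", "search", 100),  # hypothetical rows: three channels
        ("2019-03-04", "social", 100),  # share the same week_start
        ("2019-03-04", "email", 100),
        ("2019-03-11", "search", 100),  # a week with a single row
    ],
)

duplicates = rows_per_key(conn, "channel_spend", "week_start")
print(duplicates)  # weeks that would multiply rows on the other side of a join
```

Any key this returns will repeat each matching row from the other table that many times, so an empty result on at least one side means sums taken from the other side are safe after the join.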