SQL: Joining multiple tables in BigQuery

I want to join multiple tables in BigQuery, but the solution I found does not give me the desired output.

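My starting point is as follows. I am creating 5 separate tables, one for each possible rating value, each showing how often specific pages received that rating.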

Each table is created like this (shown here for the 5-star rating):

#standardSQL
  CREATE TEMPORARY FUNCTION tables_in_range(suffix STRING) AS (suffix BETWEEN (
    SELECT
      FORMAT_DATE('%y%m%d',
        DATE('2018-06-01')))
    AND (
    SELECT
      FORMAT_DATE('%y%m%d',
        DATE('2018-06-30'))));

SELECT
  h.page.pagePath AS page,
  COUNT(h.eventInfo.eventLabel) AS five_star
FROM
  `table.ga_sessions_20*` AS t,
  t.hits AS h
WHERE
  h.eventInfo.eventAction='rating'
  AND h.eventInfo.eventLabel ='5'
  AND tables_in_range(_TABLE_SUFFIX)
  AND REGEXP_CONTAINS(h.page.pagePath,
    r'/xyz/')
  AND h.type='EVENT'
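GROUP BY 1

When I join the tables in the way described there, I unfortunately do not get the expected result. Instead, the join only keeps the pages that all 5 tables have in common, that is, pages that received every one of the 5 possible ratings from 1 to 5.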

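What I would like the join to produce is a single table that lists every page together with its count for each rating.
The problem I see is that a page that has not received a particular rating is currently dropped from the joined result. Unfortunately, I could not find a solution with UNION ALL, CROSS JOIN, or LEFT JOIN, so I would greatly appreciate any support here.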
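
To illustrate the behaviour described above, here is a minimal sketch with two hypothetical inline tables (`five_star` and `four_star` and their values are made up for illustration and stand in for two of the five rating tables). An inner join keeps only the pages present in both inputs:

```sql
#standardSQL
-- Hypothetical inline data standing in for two of the five rating tables.
WITH five_star AS (
  SELECT 'A' AS page, 3 AS five_star_rating UNION ALL
  SELECT 'B', 1
), four_star AS (
  SELECT 'B' AS page, 2 AS four_star_rating UNION ALL
  SELECT 'C', 4
)
SELECT
  page,
  five_star_rating,
  four_star_rating
FROM five_star
INNER JOIN four_star USING (page)
-- Only page 'B' exists in both inputs, so pages 'A' and 'C' are dropped.
```

With five tables joined this way, a page survives only if it appears in every one of them, which matches the output described in the question.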

Below is an example for BigQuery Standard SQL:

#standardSQL
SELECT 
  page,
  SUM(five_star_rating) five_star_rating,
  SUM(four_star_rating) four_star_rating,
  SUM(three_star_rating) three_star_rating,
  SUM(two_star_rating) two_star_rating,
  SUM(one_star_rating) one_star_rating
FROM (
  SELECT page, 0 one_star_rating, 0 two_star_rating, 0 three_star_rating, 0 four_star_rating, five_star_rating FROM `project.dataset.table5` UNION ALL
  SELECT page, 0, 0, 0, four_star_rating, 0 FROM `project.dataset.table4` UNION ALL
  SELECT page, 0, 0, three_star_rating, 0, 0 FROM `project.dataset.table3` UNION ALL
  SELECT page, 0, two_star_rating, 0, 0, 0 FROM `project.dataset.table2` UNION ALL
  SELECT page, one_star_rating, 0, 0, 0, 0 FROM `project.dataset.table1` 
)
GROUP BY page
You can test and play with the above using dummy data, as in the following query:

#standardSQL
WITH `project.dataset.table5` AS (
  SELECT 'A' page, 1 five_star_rating UNION ALL
  SELECT 'B', 1 UNION ALL
  SELECT 'C', 1 
), `project.dataset.table4` AS (
  SELECT 'C' page, 1 four_star_rating UNION ALL
  SELECT 'D', 1 UNION ALL
  SELECT 'F', 1 
), `project.dataset.table3` AS (
  SELECT 'F' page, 1 three_star_rating UNION ALL
  SELECT 'G', 1 UNION ALL
  SELECT 'H', 1 
), `project.dataset.table2` AS (
  SELECT 'H' page, 1 two_star_rating UNION ALL
  SELECT 'I', 1 UNION ALL
  SELECT 'J', 1 
), `project.dataset.table1` AS (
  SELECT 'J' page, 1 one_star_rating UNION ALL
  SELECT 'K', 1 UNION ALL
  SELECT 'L', 1 
)
SELECT 
  page,
  SUM(five_star_rating) five_star_rating,
  SUM(four_star_rating) four_star_rating,
  SUM(three_star_rating) three_star_rating,
  SUM(two_star_rating) two_star_rating,
  SUM(one_star_rating) one_star_rating
FROM (
  SELECT page, 0 one_star_rating, 0 two_star_rating, 0 three_star_rating, 0 four_star_rating, five_star_rating FROM `project.dataset.table5` UNION ALL
  SELECT page, 0, 0, 0, four_star_rating, 0 FROM `project.dataset.table4` UNION ALL
  SELECT page, 0, 0, three_star_rating, 0, 0 FROM `project.dataset.table3` UNION ALL
  SELECT page, 0, two_star_rating, 0, 0, 0 FROM `project.dataset.table2` UNION ALL
  SELECT page, one_star_rating, 0, 0, 0, 0 FROM `project.dataset.table1` 
)
GROUP BY page

"Unfortunately, I could not find a solution with UNION ALL, CROSS JOIN, or LEFT JOIN"

Another option is to use FULL JOIN, as in the example below:

#standardSQL
SELECT
  COALESCE(five_star.page, four_star.page, three_star.page, two_star.page, one_star.page) AS page,
  IFNULL(five_star.five_star_rating, 0) AS five_star,
  IFNULL(four_star.four_star_rating, 0) AS four_star,
  IFNULL(three_star.three_star_rating, 0) AS three_star,
  IFNULL(two_star.two_star_rating, 0) AS two_star,
  IFNULL(one_star.one_star_rating, 0) AS one_star
FROM `project.dataset.table5` five_star
FULL JOIN `project.dataset.table4` four_star USING (page)
FULL JOIN `project.dataset.table3` three_star USING (page)
FULL JOIN `project.dataset.table2` two_star USING (page)
FULL JOIN `project.dataset.table1` one_star USING (page)
You can test and play with the above using dummy data, as in the following query:

#standardSQL
WITH `project.dataset.table5` AS (
  SELECT 'A' page, 1 five_star_rating UNION ALL
  SELECT 'B', 1 UNION ALL
  SELECT 'C', 1 
), `project.dataset.table4` AS (
  SELECT 'C' page, 1 four_star_rating UNION ALL
  SELECT 'D', 1 UNION ALL
  SELECT 'F', 1 
), `project.dataset.table3` AS (
  SELECT 'F' page, 1 three_star_rating UNION ALL
  SELECT 'G', 1 UNION ALL
  SELECT 'H', 1 
), `project.dataset.table2` AS (
  SELECT 'H' page, 1 two_star_rating UNION ALL
  SELECT 'I', 1 UNION ALL
  SELECT 'J', 1 
), `project.dataset.table1` AS (
  SELECT 'J' page, 1 one_star_rating UNION ALL
  SELECT 'K', 1 UNION ALL
  SELECT 'L', 1 
)
SELECT
  COALESCE(five_star.page, four_star.page, three_star.page, two_star.page, one_star.page) AS page,
  IFNULL(five_star.five_star_rating, 0) AS five_star,
  IFNULL(four_star.four_star_rating, 0) AS four_star,
  IFNULL(three_star.three_star_rating, 0) AS three_star,
  IFNULL(two_star.two_star_rating, 0) AS two_star,
  IFNULL(one_star.one_star_rating, 0) AS one_star
FROM `project.dataset.table5` five_star
FULL JOIN `project.dataset.table4` four_star USING (page)
FULL JOIN `project.dataset.table3` three_star USING (page)
FULL JOIN `project.dataset.table2` two_star USING (page)
FULL JOIN `project.dataset.table1` one_star USING (page)   
The result is as expected:

Row page    five_star   four_star   three_star  two_star    one_star
1   A       1           0           0           0           0
2   B       1           0           0           0           0
3   C       1           1           0           0           0
4   D       0           1           0           0           0
5   F       0           1           1           0           0
6   G       0           0           1           0           0
7   H       0           0           1           1           0
8   I       0           0           0           1           0
9   J       0           0           0           1           1
10  K       0           0           0           0           1
11  L       0           0           0           0           1

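The problem in your query is that you are only adding columns to the pages that have 5-star rating events. That is why a full join is suggested: it also adds rows for pages that are missing from the leftmost table.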
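
As a small sketch of this point, the query below uses a LEFT JOIN that starts from a hypothetical 5-star table (the inline tables `five_star` and `four_star` and their values are assumptions for illustration only). Page 'D', which only has 4-star ratings, never reaches the output:

```sql
#standardSQL
-- Hypothetical inline data; 'D' has no 5-star rating.
WITH five_star AS (
  SELECT 'A' AS page, 1 AS five_star_rating UNION ALL
  SELECT 'C', 1
), four_star AS (
  SELECT 'C' AS page, 1 AS four_star_rating UNION ALL
  SELECT 'D', 1
)
SELECT
  page,
  IFNULL(five_star_rating, 0) AS five_star,
  IFNULL(four_star_rating, 0) AS four_star
FROM five_star
LEFT JOIN four_star USING (page)  -- rows come only from five_star
ORDER BY page
-- Returns pages 'A' and 'C'; 'D' is lost because it is not in five_star.
```

Replacing LEFT JOIN with FULL JOIN in this sketch should also return 'D', with 0 in the five_star column.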

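I think the solution in your case is simpler and does not need a join at all, because all the data is in the same table. Here is a flat, unpivoted version: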

#standardSQL
  CREATE TEMPORARY FUNCTION tables_in_range(suffix STRING) AS (suffix BETWEEN '20180601' AND '20180630');

SELECT
  h.page.pagePath AS page,
  h.eventInfo.eventLabel stars,
  COUNT(1) as events
FROM
  `project.dataset.ga_sessions_*` AS t, t.hits AS h
WHERE
  h.eventInfo.eventAction='rating'
  AND h.eventInfo.eventLabel between '1' and '5'
  AND tables_in_range(_TABLE_SUFFIX)
  AND REGEXP_CONTAINS(h.page.pagePath,
    r'/xyz/')
  AND h.type='EVENT'
GROUP BY 1, 2

If you really need pivot-like columns, it would look like this:

#standardSQL
  CREATE TEMPORARY FUNCTION tables_in_range(suffix STRING) AS (suffix BETWEEN '20180601' AND '20180630');

SELECT
  h.page.pagePath AS page,
  SUM( IF(h.eventInfo.eventLabel = '1', 1, 0) ) as oneStarEvents,
  SUM( IF(h.eventInfo.eventLabel = '2', 1, 0) ) as twoStarEvents,
  SUM( IF(h.eventInfo.eventLabel = '3', 1, 0) ) as threeStarEvents,
  SUM( IF(h.eventInfo.eventLabel = '4', 1, 0) ) as fourStarEvents,
  SUM( IF(h.eventInfo.eventLabel = '5', 1, 0) ) as fiveStarEvents
FROM
  `project.dataset.ga_sessions_*` AS t, t.hits AS h
WHERE
  h.eventInfo.eventAction='rating'
  AND h.eventInfo.eventLabel between '1' and '5'
  AND tables_in_range(_TABLE_SUFFIX)
  AND REGEXP_CONTAINS(h.page.pagePath,
    r'/xyz/')
  AND h.type='EVENT'
GROUP BY 1

Instead of SUM(IF(condition, 1, 0)) you can also use COUNT(IF(condition, 1, NULL)).
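
A short sketch comparing the two forms on hypothetical inline data (the `ratings` table, its `label` column, and its rows are made up for illustration). It also shows `COUNTIF`, a BigQuery aggregate function that expresses the same count more directly:

```sql
#standardSQL
-- Hypothetical inline data standing in for the flattened rating events.
WITH ratings AS (
  SELECT '/xyz/a' AS page, '5' AS label UNION ALL
  SELECT '/xyz/a', '5' UNION ALL
  SELECT '/xyz/a', '1' UNION ALL
  SELECT '/xyz/b', '4'
)
SELECT
  page,
  SUM(IF(label = '5', 1, 0))      AS five_star_sum,      -- adds 1 per match
  COUNT(IF(label = '5', 1, NULL)) AS five_star_count,    -- COUNT skips NULLs
  COUNTIF(label = '5')            AS five_star_countif   -- counts TRUE rows
FROM ratings
GROUP BY page
ORDER BY page
```

All three columns should agree for every page: 2 for '/xyz/a' and 0 for '/xyz/b'.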


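Try a left outer join. The table that lists all pages goes on the left side of the join expression. Then use IsNull to convert NULL values to zero. There are plenty of online tutorials and examples. Here is an example:

Thanks for forwarding this post to me!

Dear Mikhail, thank you very much for both of your replies. I just tried the FULL JOIN solution and it works great! I will also try your example with the subselect/UNION ALL, but so far this is exactly the table I was looking for. Thanks a lot.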