BigQuery Standard SQL: group by and aggregate across multiple columns
Sample dataset:
|ownerId|category|aggCategory1|aggCategory2|
--------------------------------------------
| 1 | dog | animal | dogs |
| 1 | puppy | animal | dogs |
| 2 | daisy | flower | ignore |
| 3 | rose | flower | ignore |
| 4 | cat | animal | cats |
...
I am looking for a GROUP BY that counts the number of owners for each value found in category, aggCategory1, and aggCategory2. Example output:
|# of owners|summaryCategory|
-----------------------------
| 1 | dog |
| 1 | puppy |
| 1 | daisy |
| 1 | rose |
| 1 | cat |
| 2 | animal |
| 2 | flower |
| 1 | dogs |
| 2 | ignore |
| 1 | cats |
It does not have to be in that format, but I would like to get the data points above.
Thanks
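-- `table` is a placeholder for your table name.
-- BigQuery needs UNION DISTINCT (or UNION ALL); a bare UNION is rejected.
-- UNION DISTINCT drops duplicate (ownerID, category) pairs, so each owner
-- is counted once per value.
SELECT COUNT(T.ownerID), T.category
FROM (
  SELECT ownerID, category
  FROM table
  UNION DISTINCT
  SELECT ownerID, aggCategory1
  FROM table
  UNION DISTINCT
  SELECT ownerID, aggCategory2
  FROM table
) AS T
GROUP BY T.category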
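Using GROUP BY with a UNION of all your category columns works well.
One approach is to use UNION ALL to unpivot the data and then aggregate in the outer query: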
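-- COUNT(DISTINCT ownerID) counts each owner once per value; COUNT(*) would
-- count owner 1 twice for 'animal' and 'dogs' in the sample data.
SELECT category, COUNT(DISTINCT ownerID)
FROM (SELECT ownerID, category
      FROM t
      UNION ALL
      SELECT ownerID, aggCategory1
      FROM t
      UNION ALL
      SELECT ownerID, aggCategory2
      FROM t
     ) t
GROUP BY category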
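To try the unpivot approach against the sample rows from the question, the sketch below inlines them as a CTE and reports both the row count and the distinct-owner count per value. The names `sample`, `unpivoted`, `row_count`, and `owner_count` are made up here for illustration. On these rows the two counts differ for `animal` and `dogs`, because owner 1 has two rows sharing those values; the distinct count is the one that matches the output requested in the question.

```sql
-- Inline copy of the question's sample rows; `sample` is a hypothetical name.
WITH sample AS (
  SELECT 1 AS ownerId, 'dog' AS category, 'animal' AS aggCategory1, 'dogs' AS aggCategory2 UNION ALL
  SELECT 2, 'daisy', 'flower', 'ignore' UNION ALL
  SELECT 1, 'puppy', 'animal', 'dogs' UNION ALL
  SELECT 3, 'rose', 'flower', 'ignore' UNION ALL
  SELECT 4, 'cat', 'animal', 'cats'
)
SELECT
  summaryCategory,
  COUNT(*) AS row_count,                   -- one per unpivoted row
  COUNT(DISTINCT ownerId) AS owner_count   -- each owner once per value
FROM (
  -- Stack the three category columns into a single column.
  SELECT ownerId, category AS summaryCategory FROM sample
  UNION ALL
  SELECT ownerId, aggCategory1 FROM sample
  UNION ALL
  SELECT ownerId, aggCategory2 FROM sample
) AS unpivoted
GROUP BY summaryCategory
ORDER BY summaryCategory;
```

If a row count is what you actually want (for example, counting pets rather than owners), `row_count` is the column to keep instead.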
The more BigQuery-idiomatic way is to use arrays:
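-- UNNEST turns the three category columns into one row per value;
-- COUNT(DISTINCT ownerID) counts each owner once per value.
SELECT cat, COUNT(DISTINCT ownerID)
FROM t CROSS JOIN
     UNNEST(ARRAY[category, aggCategory1, aggCategory2]) cat
GROUP BY cat;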
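As a runnable sketch of the array variant, the query below again inlines the question's sample rows (the CTE name `sample` and the output aliases are assumptions for illustration) and unnests the three columns in place. Compared with the UNION ALL form, the table is referenced only once, which tends to keep the query shorter as more category columns are added.

```sql
-- Inline copy of the question's sample rows; `sample` is a hypothetical name.
WITH sample AS (
  SELECT 1 AS ownerId, 'dog' AS category, 'animal' AS aggCategory1, 'dogs' AS aggCategory2 UNION ALL
  SELECT 1, 'puppy', 'animal', 'dogs' UNION ALL
  SELECT 2, 'daisy', 'flower', 'ignore' UNION ALL
  SELECT 3, 'rose', 'flower', 'ignore' UNION ALL
  SELECT 4, 'cat', 'animal', 'cats'
)
SELECT
  cat AS summaryCategory,
  COUNT(DISTINCT ownerId) AS owner_count   -- each owner once per value
FROM sample
-- Build a three-element array per row and expand it to one row per element.
CROSS JOIN UNNEST([category, aggCategory1, aggCategory2]) AS cat
GROUP BY cat
ORDER BY cat;
```

One thing to keep in mind: if the same string can appear in more than one of the three columns, its owners are merged under a single `summaryCategory` value.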
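Use UNION ALL in a CTE:
-- `table` is a placeholder for your table name.
-- BigQuery rejects a bare UNION, so UNION ALL is written out, and
-- COUNT(DISTINCT ownerID) counts each owner once per value.
WITH cte AS
(
  SELECT ownerID, category AS summaryCategory
  FROM table
  UNION ALL
  SELECT ownerID, aggCategory1 AS summaryCategory
  FROM table
  UNION ALL
  SELECT ownerID, aggCategory2 AS summaryCategory
  FROM table
)
SELECT COUNT(DISTINCT ownerID), summaryCategory
FROM cte
GROUP BY summaryCategory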
Have you looked at GROUP BY and COUNT? If you did, what was the problem?