BigQuery: Cannot use count distinct with scoped aggregation
I have a table:
+----------+----------------------------+----------+-------------+
| visit_id | browsed_categories         | num_seen | num_borrows |
+----------+----------------------------+----------+-------------+
| 1        | fiction,history            | 20       | 3           |
| 2        | selfhelp,fiction,science   | 15       | 3           |
| 3        | cooking,kids,home,selfhelp | 7        | 2           |
+----------+----------------------------+----------+-------------+
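I am trying to summarize this table to see whether there is a correlation between the number of distinct browsed categories and borrowing. The output I am after looks like this: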
+-------------+---------------------------------+-------------------------+
| borrow_rate | num_distinct_browsed_categories | distinct_categories |
+-------------+---------------------------------+-------------------------+
| 0 | 3 | cooking,selfhelp,home |
| 1 | 2 | history,fiction |
+-------------+---------------------------------+-------------------------+
My query is as follows:
select
  *,
  count(distinct(split(all_cats, ','))) as num_distinct_browsed_categories
from
(
  select
    (num_borrows/num_seen) as borrow_rate,
    count(visit_id) as num_visits,
    group_concat(browsed_categories, ',') as all_cats
  from [table]
  group by borrow_rate
)
The query fails with the following error:
Cannot use count distinct with scoped aggregation
How can I modify the query to get the desired output?

Below is a version for BigQuery Standard SQL:
#standardSQL
SELECT
  *,
  (SELECT COUNT(DISTINCT cat) FROM UNNEST(SPLIT(all_cats, ',')) cat) AS num_distinct_browsed_categories
FROM (
  SELECT
    (num_borrows/num_seen) AS borrow_rate,
    COUNT(visit_id) AS num_visits,
    STRING_AGG(browsed_categories, ',') AS all_cats
  FROM `project.dataset.table`
  GROUP BY borrow_rate
)
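If you want to try this without touching a real table, here is a minimal self-contained sketch that inlines the three sample rows from the question. The CTE name `visits` is just a placeholder I made up, and the extra `distinct_categories` column (built with `STRING_AGG(DISTINCT ...)`) is my assumption about how to get the third column of the desired output; it is not part of the query above.

```sql
#standardSQL
WITH visits AS (
  -- Sample rows copied from the question; `visits` is a hypothetical name.
  SELECT 1 AS visit_id, 'fiction,history' AS browsed_categories, 20 AS num_seen, 3 AS num_borrows UNION ALL
  SELECT 2, 'selfhelp,fiction,science', 15, 3 UNION ALL
  SELECT 3, 'cooking,kids,home,selfhelp', 7, 2
)
SELECT
  borrow_rate,
  num_visits,
  -- Split the concatenated string into an array and count unique elements per row.
  (SELECT COUNT(DISTINCT cat)
   FROM UNNEST(SPLIT(all_cats, ',')) AS cat) AS num_distinct_browsed_categories,
  -- Assumption: rebuild a de-duplicated comma-separated list for display.
  (SELECT STRING_AGG(DISTINCT cat, ',')
   FROM UNNEST(SPLIT(all_cats, ',')) AS cat) AS distinct_categories
FROM (
  SELECT
    (num_borrows / num_seen) AS borrow_rate,
    COUNT(visit_id) AS num_visits,
    STRING_AGG(browsed_categories, ',') AS all_cats
  FROM visits
  GROUP BY borrow_rate
)
```

Note that with these three rows every visit has a different borrow_rate (3/20, 3/15 and 2/7), so each group contains a single visit. To see categories from several visits merged into one group you would probably need to round or bucket borrow_rate first.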
By the way, if for some reason you are still tied to BigQuery Legacy SQL, just replace
count(distinct(split(all_cats, ',')))
with
exact_count_distinct(split(all_cats, ','))
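in your original query.

Thanks! Using the exact_count_distinct() function worked. Is there a way to check whether the all_cats column contains any value from a given list of values?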
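On the follow-up about checking whether all_cats contains any value from a given list: one possible approach in Standard SQL, offered only as a sketch, is to unnest the split string and test membership with `IN UNNEST(...)` inside an `EXISTS` subquery. The CTE `grouped`, the wanted values and the column name `has_any_wanted` are all placeholders I made up for illustration.

```sql
#standardSQL
WITH grouped AS (
  -- Hypothetical stand-in for the output of the grouped subquery.
  SELECT 'fiction,history' AS all_cats UNION ALL
  SELECT 'selfhelp,fiction,science' UNION ALL
  SELECT 'cooking,kids,home,selfhelp'
)
SELECT
  all_cats,
  -- TRUE when at least one element of all_cats appears in the wanted list.
  EXISTS (
    SELECT 1
    FROM UNNEST(SPLIT(all_cats, ',')) AS cat
    WHERE cat IN UNNEST(['history', 'science'])  -- hypothetical list of wanted values
  ) AS has_any_wanted
FROM grouped
```

The comparison is exact and case-sensitive, so values with stray spaces around the commas would need a TRIM on `cat` first.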