Google BigQuery: how to use GROUP_CONCAT and/or NEST but limit the number of elements in the result


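I periodically aggregate metrics over a users table, where each aggregation bucket contains on the order of tens of thousands of users. I would like to show a few sample users for each bucket, but I cannot produce a GROUP_CONCAT that is thousands or possibly millions of elements long.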

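This is not possible with the legacy SQL aggregate functions, but it is possible in standard SQL with its aggregate functions: STRING_AGG corresponds to GROUP_CONCAT in legacy SQL, and ARRAY_AGG corresponds to NEST. Both functions support an optional LIMIT clause, as described in the documentation.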

For example:

select string_agg(x LIMIT 2) 
from unnest(['hello', 'world!', 'foo', 'bar', 'baz']) x
returns the string "hello,world!", and

select array_agg(x LIMIT 2) 
from unnest(['hello', 'world!', 'foo', 'bar', 'baz']) x
returns the array ['hello', 'world!'].
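
Since the original question asks for one result row per bucket, the same LIMIT clause can be combined with GROUP BY. The following is a minimal sketch in standard SQL, not a tested production query: the inline rows and the column aliases sample_string and sample_array are made up for illustration, and the ORDER BY inside each aggregate is there only to make the choice of samples deterministic.

```sql
-- Standard SQL sketch: keep at most two sample values per id.
-- The rows below are hypothetical data standing in for the users table.
SELECT
  id,
  -- Comma-separated string of up to two values per group
  STRING_AGG(x ORDER BY x LIMIT 2) AS sample_string,
  -- Array of up to two values per group
  ARRAY_AGG(x ORDER BY x LIMIT 2) AS sample_array
FROM UNNEST([
  STRUCT(1 AS id, 'hello' AS x),
  STRUCT(1, 'world!'),
  STRUCT(1, 'foo'),
  STRUCT(2, 'hello2'),
  STRUCT(2, 'world2!'),
  STRUCT(2, 'foo2')
])
GROUP BY id
```

Without the ORDER BY, which elements survive the LIMIT is not guaranteed, which may be acceptable when any few sample users will do.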

For legacy SQL:

SELECT NEST(x) FROM (
  SELECT x FROM 
    (SELECT 'hello' AS x), 
    (SELECT 'world!' AS x), 
    (SELECT 'foo' AS x), 
    (SELECT 'bar' AS x), 
    (SELECT 'baz' AS x)
  LIMIT 2
) 
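With GROUP_CONCAT_UNQUOTED(x) in place of NEST(x), this query returns the string "hello,world!"; with NEST(x) it returns the array ['hello', 'world!'].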

Below is an example with GROUP BY:

SELECT id, GROUP_CONCAT_UNQUOTED(x) FROM (
  SELECT id, x, ROW_NUMBER() OVER(PARTITION BY id) AS num FROM 
    (SELECT 1 AS id, 'hello' AS x), 
    (SELECT 1 AS id, 'world!' AS x), 
    (SELECT 1 AS id, 'foo' AS x), 
    (SELECT 1 AS id, 'bar' AS x), 
    (SELECT 1 AS id, 'baz' AS x),
    (SELECT 2 AS id, 'hello2' AS x), 
    (SELECT 2 AS id, 'world2!' AS x), 
    (SELECT 2 AS id, 'foo2' AS x), 
    (SELECT 2 AS id, 'bar2' AS x), 
    (SELECT 2 AS id, 'baz2' AS x)
)  
WHERE num < 3
GROUP BY id

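Note that the simpler approach of a LIMIT inside the subquery does not carry over to GROUP BY, which was part of the original question but was omitted from the first example for simplicity. The ROW_NUMBER() query handles the grouped case.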