Google BigQuery: "Response too large to return" with LIMIT 1;
I ran into a problem while experimenting with BigQuery. My query is:
SELECT * FROM (
SELECT a.title, a.counter , MAX(b.num_characters) as max
FROM (
SELECT title, count(*) as counter FROM publicdata:samples.wikipedia
GROUP EACH BY title
ORDER BY counter DESC
LIMIT 10
) a JOIN
(SELECT title,num_characters FROM publicdata:samples.wikipedia
) b ON a.title = b.title
GROUP BY a.title, a.counter)
LIMIT 1;
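The query is valid, but I get the error "Response too large to return". The first subquery runs fine on its own; what I am trying to do is fetch more columns for its rows, and that is where it fails. Don't worry about the LIMIT 1: the response becomes too large before the query ever reaches that stage.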
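Try dropping the second subquery: it only selects 2 columns from the large table and applies no filter, so you can join against the table directly. A working alternative is: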
SELECT
a.title, a.counter, MAX(b.num_characters) AS max
FROM
publicdata:samples.wikipedia b JOIN(
SELECT
title, COUNT(*) AS counter
FROM
publicdata:samples.wikipedia
GROUP EACH BY title
ORDER BY
counter DESC
LIMIT 10) a
ON a.title = b.title
GROUP BY
a.title,
a.counter
This takes 15.4 seconds.
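For readers on standard SQL (GoogleSQL) rather than the legacy dialect used in this thread, the same join might be sketched as follows. This is an assumption-laden port, not part of the original answer: the backticked table path `bigquery-public-data.samples.wikipedia` and the alias `max_chars` are my choices, GROUP EACH BY becomes a plain GROUP BY, and the timing above was not measured for this version.

```sql
#standardSQL
-- Sketch of the join above in standard SQL (assumed port, not from the original answer).
SELECT
  a.title,
  a.counter,
  MAX(b.num_characters) AS max_chars  -- alias renamed from "max" for readability
FROM
  `bigquery-public-data.samples.wikipedia` AS b
JOIN (
  -- The 10 most frequent titles; standard SQL has no GROUP EACH BY.
  SELECT
    title, COUNT(*) AS counter
  FROM
    `bigquery-public-data.samples.wikipedia`
  GROUP BY title
  ORDER BY counter DESC
  LIMIT 10
) AS a
ON a.title = b.title
GROUP BY
  a.title,
  a.counter
```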
We can do this faster with TOP():
TOP() works as a simpler, faster alternative to the SELECT COUNT(*)/GROUP/LIMIT pattern.
Now it runs in only 6.5 s, processing 15.9 GB.
SELECT
a.title title, counter, MAX(num_characters) max
FROM
publicdata:samples.wikipedia b
JOIN
(
SELECT
TOP(title, 10) AS title, COUNT(*) AS counter
FROM
publicdata:samples.wikipedia
) a
ON a.title=b.title
GROUP BY
title, counter
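As far as I know, TOP() exists only in legacy SQL. In standard SQL the closest counterpart is APPROX_TOP_COUNT, which returns an array of (value, count) structs that has to be unnested. The sketch below covers only the subquery that finds the 10 most frequent titles; the table path and the aliases `top_titles` and `t` are assumptions for illustration. The counts are approximate, so they may differ slightly from the exact GROUP BY form.

```sql
#standardSQL
-- Sketch: approximate top-10 titles in standard SQL (assumed counterpart of TOP(title, 10)).
SELECT
  t.value AS title,    -- the title
  t.count AS counter   -- its approximate number of rows
FROM (
  SELECT
    APPROX_TOP_COUNT(title, 10) AS top_titles
  FROM
    `bigquery-public-data.samples.wikipedia`
),
UNNEST(top_titles) AS t
```

This result can take the place of the inner subquery `a` in a join like the one above. If exact counts matter, the GROUP BY / ORDER BY / LIMIT form is the safer choice.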