Google Cloud Platform: BigQuery STRING_AGG() on multiple fields produces an error
My query looks like this:
SELECT
  DISTINCT id,
  STRING_AGG(DISTINCT column1) AS mobile,
  STRING_AGG(DISTINCT country) AS country,
  STRING_AGG(DISTINCT language) AS language,
  STRING_AGG(DISTINCT address) AS address,
  STRING_AGG(DISTINCT model) AS model,
  STRING_AGG(DISTINCT car) AS car,
  STRING_AGG(DISTINCT class) AS class,
  home_email,
  buisness_email,
  MAX(timestamp) AS timestamp
FROM
  user
GROUP BY
  id,
  home_email,
  buisness_email
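When I run this query in BigQuery against my 2 TB table, with the query settings configured to write the output to a destination table, it fails with the following error:

Resources exceeded during query execution: Your project or organization exceeded the maximum disk and memory limit available for shuffle operations. Consider provisioning more slots, reducing query concurrency, or using more efficient logic in this job.

Note: when I run it on 500 GB of data, it works fine.

So how can I solve this problem?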
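Also, after this I will need to run the same query on the union of this table and another 1 TB table.

I can offer a suggestion (but I don't know whether it works, because I don't have a large enough dataset to test it. If it doesn't, I will delete this "answer attempt"). The idea is to split the aggregations into separate subqueries, then join all of those sub-results back together.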
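-- DISTINCT is redundant alongside GROUP BY, so it is omitted here.
WITH agg_mobile AS (
  SELECT
    id,
    STRING_AGG(DISTINCT column1) AS mobile,
    home_email,
    buisness_email,
    MAX(timestamp) AS timestamp
  FROM
    user
  GROUP BY
    id,
    home_email,
    buisness_email
),
agg_country AS (
  SELECT
    id,
    home_email,
    buisness_email,
    STRING_AGG(DISTINCT country) AS country
  FROM
    user
  GROUP BY
    id,
    home_email,
    buisness_email
)
....
SELECT agg_mobile.*,
  agg_country.country,
  ....
FROM agg_mobile
-- Join on every GROUP BY key, not only id; otherwise an id with several
-- email combinations matches several rows and the output is duplicated.
-- Note: rows where an email is NULL will not match under "=".
LEFT JOIN agg_country
  ON agg_mobile.id = agg_country.id
  AND agg_mobile.home_email = agg_country.home_email
  AND agg_mobile.buisness_email = agg_country.buisness_email
LEFT JOIN .....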
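Since the idea above was not tested on real data, here is a small sanity check of its logic only, as a sketch. It assumes an in-memory SQLite database in place of BigQuery, uses SQLite's group_concat(DISTINCT ...) as a stand-in for STRING_AGG(DISTINCT ...), and uses four made-up rows with just two of the aggregated columns. It compares the single query against the split-and-join version, once joined on every grouping key and once joined on id alone. It says nothing about whether the split avoids the shuffle limit at 2 TB.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for BigQuery here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (
        id INTEGER, column1 TEXT, country TEXT,
        home_email TEXT, buisness_email TEXT, timestamp INTEGER
    );
    INSERT INTO user VALUES
        (1, '111', 'FR', 'a@home',  'a@work', 10),
        (1, '222', 'FR', 'a@home',  'a@work', 20),
        (1, '333', 'DE', 'a@home2', 'a@work', 30),
        (2, '444', 'US', 'b@home',  'b@work', 40);
""")

# All aggregations in one pass, as in the question.
SINGLE = """
    SELECT id, home_email, buisness_email,
           group_concat(DISTINCT column1) AS mobile,
           group_concat(DISTINCT country) AS country,
           MAX(timestamp) AS timestamp
    FROM user
    GROUP BY id, home_email, buisness_email
"""

# One subquery per aggregation, joined back together.
SPLIT = """
    WITH agg_mobile AS (
        SELECT id, home_email, buisness_email,
               group_concat(DISTINCT column1) AS mobile,
               MAX(timestamp) AS timestamp
        FROM user
        GROUP BY id, home_email, buisness_email
    ),
    agg_country AS (
        SELECT id, home_email, buisness_email,
               group_concat(DISTINCT country) AS country
        FROM user
        GROUP BY id, home_email, buisness_email
    )
    SELECT m.id, m.home_email, m.buisness_email,
           m.mobile, c.country, m.timestamp
    FROM agg_mobile AS m
    LEFT JOIN agg_country AS c ON {join_condition}
"""

ALL_KEYS = ("m.id = c.id AND m.home_email = c.home_email "
            "AND m.buisness_email = c.buisness_email")
ID_ONLY = "m.id = c.id"


def normalized(sql):
    """Run a query and make rows comparable regardless of concat order."""
    rows = conn.execute(sql).fetchall()
    return sorted(
        (rid, home, work,
         tuple(sorted(mobile.split(","))),
         tuple(sorted(country.split(","))),
         ts)
        for rid, home, work, mobile, country, ts in rows
    )


single = normalized(SINGLE)
split_all_keys = normalized(SPLIT.format(join_condition=ALL_KEYS))
split_id_only = normalized(SPLIT.format(join_condition=ID_ONLY))

print(single == split_all_keys)          # True
print(len(single), len(split_id_only))   # 3 5
```

On this sample the split version matches the single query when the join uses all three grouping keys. Joining on id alone returns 5 rows instead of 3, because id 1 has two different home_email values and each of its groups matches both country rows.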
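Thank you very much. That's a good idea. I will try this and let you know the result.

I have tried this: SELECT id, ARRAY_AGG(DISTINCT student_id) AS student_id, MAX(date) AS date FROM student, UNNEST(student_id) AS student_id WHERE student_id != "1111" GROUP BY id; but it still doesn't work.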