BigQuery: group by array

I want to group by an array. Example query:
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
  SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
)
SELECT
  description,
  ARRAY(
    SELECT AS STRUCT
      JSON_EXTRACT_SCALAR(kv, '$.key') key,
      JSON_EXTRACT_SCALAR(kv, '$.value') value
    FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
      UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^\[{|}]$', ''), '}')]) kv
  ) labels,
  cost
FROM `project.dataset.table`

Result of the query above:

Row  description  labels.key   labels.value   cost
1    compute      application  scaled-server  0.323316
                  department   hrd
2    compute      application  scaled-server  0.342825
                  department   hrd

I would like to get the following result instead:

Row  description  labels.key   labels.value   cost
1    compute      application  scaled-server  0.666141
                  department   hrd
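
What is the grouping logic? Do all keys and values have to be identical for rows to be grouped together?

I have just updated the question. The "description" column holds services such as VM, cloudSQL, and BQ, while the "labels" column holds the labels attached to them, for example a VM label (env=prod). Ultimately, I want to be able to get the cost of compute or VM resources that have env=staging. That is why I need to group by description first, and then by key-value.
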
Because labels is stored as a string, you can sum the cost per description and raw labels string first, and parse the string into an array afterwards. Rows are grouped together only when their labels strings are identical:

#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
  SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
), temp AS (
  -- Aggregate while labels is still a plain string
  SELECT description, labels, SUM(cost) AS cost
  FROM `project.dataset.table`
  GROUP BY description, labels
)
SELECT
  description,
  -- Parse the labels string into an array of (key, value) structs
  ARRAY(
    SELECT AS STRUCT
      JSON_EXTRACT_SCALAR(kv, '$.key') key,
      JSON_EXTRACT_SCALAR(kv, '$.value') value
    FROM UNNEST(SPLIT(labels, '},{')) kv_temp,
      UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^\[{|}]$', ''), '}')]) kv
  ) labels,
  cost
FROM temp
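
The comment above also mentions wanting the cost per description and per key-value pair (for example, everything labeled env=staging). The following is only a sketch of one way that could look, not part of the original answer. It assumes that labels always holds a valid JSON array and that the JSON_EXTRACT_ARRAY function is available in your BigQuery environment. Each label is flattened into its own row before aggregating:

```sql
#standardSQL
WITH `project.dataset.table` AS (
  SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.323316 cost UNION ALL
  SELECT 'compute' description, '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]' labels, 0.342825 cost
)
SELECT
  description,
  JSON_EXTRACT_SCALAR(kv, '$.key') AS key,
  JSON_EXTRACT_SCALAR(kv, '$.value') AS value,
  SUM(cost) AS cost
FROM `project.dataset.table`,
  -- Assumption: labels is a valid JSON array, so each element becomes one row
  UNNEST(JSON_EXTRACT_ARRAY(labels)) AS kv
GROUP BY description, key, value
```

With the sample data, each of the two labels would carry the summed cost of both rows. A row's cost is counted once for every label it has, so the per-label totals should not be added together. To look at a single label only, a filter such as HAVING key = 'env' AND value = 'staging' could be appended.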