Google BigQuery: last value from an UNNEST collection

I am facing an issue with the following query:

SELECT
   project.id as id,
   (SELECT value FROM UNNEST(project.labels) WHERE key="key1") as key1,
   (SELECT value FROM UNNEST(project.labels) WHERE key="key2") as key2,
   ROUND(SUM(cost), 2) as charges
FROM `cloud.billing.data_123`
WHERE project.id is not null and EXTRACT(MONTH FROM usage_start_time) = 6 and EXTRACT(YEAR FROM usage_start_time) = 2020
GROUP BY id, key1, key2
ORDER by id
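It gives the total spend per project per month, in the example above for month 6 of 2020. The report is based on the billing data exported to BigQuery. The result looks like this: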

Row | id       | key1 | key2 | charges |
1   |project1  | null | null | 32      | 
2   |project1  | x    | y    | 40      |
3   |project2  | null | null | 50      | 
4   |project2  | x    | y    | 10      |

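The keys are project labels. The split happens because the labels key1 and key2 were only added to the projects in the middle of the month. So the first record, with null key values, is the total for the period when the project had no labels, and the second record, with x and y, is the total for the period when it had labels.

Is there a way to collapse everything into a single row per project, with the labels, and sum the values, like this: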

Row | id       | key1 | key2 | charges |
1   |project1  | x    | y    | 72      |
2   |project2  | x    | y    | 60      |

Thanks in advance.

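My understanding is that you want to sum the cost per project and output id, key1, key2 and charges, with key1 and key2 not null.

To achieve this, I will propose two approaches, both assuming that each project has only one unique key1 value and one unique key2 value. In other words, where key1 is null for project1, for example, it should be x.

First approach: use FIRST_VALUE to fill in key1 and key2 where they are null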

WITH data1 AS (
-- One row per billing line item, with the two labels pulled out of project.labels
SELECT
   project.id as id,
   (SELECT value FROM UNNEST(project.labels) WHERE key="key1") as key1,
   (SELECT value FROM UNNEST(project.labels) WHERE key="key2") as key2,
   cost
FROM `cloud.billing.data_123`
WHERE project.id is not null and EXTRACT(MONTH FROM usage_start_time) = 6 and EXTRACT(YEAR FROM usage_start_time) = 2020
),
data2 AS (
-- Give every row of a project the first non-null key1/key2 found for that project
SELECT id,
FIRST_VALUE(key1 IGNORE NULLS) OVER (PARTITION BY id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS key1,
FIRST_VALUE(key2 IGNORE NULLS) OVER (PARTITION BY id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS key2,
cost
FROM data1
)
SELECT id, key1,key2, ROUND(SUM(cost),2) AS charges FROM data2
GROUP BY id, key1,key2
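Note that FIRST_VALUE is used with IGNORE NULLS, so within each partition (here, each project id) it returns the first non-null value of key1 and key2 for every row. The cost can then be summed, grouped by id, key1 and key2.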
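To make the fill step concrete, here is a minimal Python sketch of the same logic outside BigQuery. The sample rows are hypothetical (they mirror the totals from the question), and the helper names `first_non_null` and `fill_and_sum` are made up for illustration; it is only meant to show what "first non-null value per partition, then group and sum" does.

```python
from collections import defaultdict

# Hypothetical rows shaped like data1: (id, key1, key2, cost).
rows = [
    ("project1", None, None, 32.0),
    ("project1", "x", "y", 40.0),
    ("project2", None, None, 50.0),
    ("project2", "x", "y", 10.0),
]


def first_non_null(values):
    """Return the first value that is not None, or None if there is none."""
    return next((v for v in values if v is not None), None)


def fill_and_sum(rows):
    # Partition the rows by project id (the PARTITION BY id step).
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[0]].append(row)

    totals = defaultdict(float)
    for pid, part in partitions.items():
        # Every row in the partition gets the same filled-in labels.
        key1 = first_non_null(r[1] for r in part)
        key2 = first_non_null(r[2] for r in part)
        for _, _, _, cost in part:
            # Equivalent of GROUP BY id, key1, key2 with SUM(cost).
            totals[(pid, key1, key2)] += cost

    return sorted(
        (pid, k1, k2, round(total, 2)) for (pid, k1, k2), total in totals.items()
    )


result = fill_and_sum(rows)
```

Because the labels are filled in before grouping, each project ends up in a single group, which is why the null rows no longer appear separately.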

Second approach

The idea is the same as in the first approach: replace the null values of key1 and key2, then sum the cost for each project.
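A minimal sketch of one way to realize this idea, assuming MAX as the aggregate that picks the label. This is an assumption for illustration, not necessarily the query the answer had in mind. MAX skips nulls, so under the one-value-per-project assumption it returns that value. The sketch runs against an in-memory SQLite table with hypothetical sample rows rather than BigQuery; the same SELECT should carry over to BigQuery standard SQL, with the `data1` CTE from the first approach as its source.

```python
import sqlite3

# In-memory stand-in for the data1 CTE; the rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data1 (id TEXT, key1 TEXT, key2 TEXT, cost REAL)")
conn.executemany(
    "INSERT INTO data1 VALUES (?, ?, ?, ?)",
    [
        ("project1", None, None, 32.0),
        ("project1", "x", "y", 40.0),
        ("project2", None, None, 50.0),
        ("project2", "x", "y", 10.0),
    ],
)

# Group by id only; MAX ignores NULLs, so it returns the label whenever
# at least one row of the project carries it.
query = """
SELECT id,
       MAX(key1) AS key1,
       MAX(key2) AS key2,
       ROUND(SUM(cost), 2) AS charges
FROM data1
GROUP BY id
ORDER BY id
"""

result = conn.execute(query).fetchall()
conn.close()
```

Grouping by id alone avoids the window step, but if a project ever had several different values for the same label, MAX would silently pick one of them.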

Output of both approaches:

Row | id       | key1 | key2 | charges |
1   |project1  | x    | y    | 72      |
2   |project2  | x    | y    | 60      |

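I don't think the query you provided in the question has any chance of working! The structure of the input data is also unclear! Can you provide more details?

Hello @MikhailBerlyant, the query is running and returns the result I added in the question description. cloud.billing.data_123 is the billing report generated automatically by BigQuery, and project.labels are the labels I added to the GCloud projects. Are these the details you meant? Sorry if I wasn't clear.

I just don't see how it can work without throwing an error. Maybe someone else will help you.

The second approach worked well for me. Thanks a lot, Alexandre!!!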