BigQuery: calculating the 0-100 percentiles of multiple columns across multiple groups
We have a BigQuery table like the following:
with
my_data as (
select 1 as num1, 32 as num2, 43 as num3, 'a' as letter union all
select 2 as num1, 21 as num2, 45 as num3, 'a' as letter union all
select 3 as num1, 99 as num2, 47 as num3, 'a' as letter union all
select 4 as num1, 83 as num2, 48 as num3, 'a' as letter union all
select 5 as num1, 55 as num2, 49 as num3, 'a' as letter union all
select 6 as num1, 35 as num2, 51 as num3, 'b' as letter union all
select 7 as num1, 94 as num2, 52 as num3, 'b' as letter union all
select 8 as num1, 17 as num2, 55 as num3, 'b' as letter union all
select 9 as num1, 33 as num2, 56 as num3, 'b' as letter union all
select 10 as num1, 81 as num2, 37 as num3, 'b' as letter union all
select 11 as num1, 42 as num2, 38 as num3, 'a' as letter union all
select 12 as num1, 26 as num2, 39 as num3, 'a' as letter union all
select 13 as num1, 92 as num2, 41 as num3, 'a' as letter union all
select 14 as num1, 38 as num2, 43 as num3, 'a' as letter union all
select 15 as num1, 31 as num2, 46 as num3, 'a' as letter union all
select 16 as num1, 53 as num2, 48 as num3, 'b' as letter union all
select 17 as num1, 49 as num2, 49 as num3, 'b' as letter union all
select 18 as num1, 71 as num2, 51 as num3, 'b' as letter union all
select 19 as num1, 67 as num2, 52 as num3, 'b' as letter union all
select 20 as num1, 62 as num2, 54 as num3, 'b' as letter
)
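letter is the column to group by, and num1, num2, num3 are the 3 columns for which we want to compute the 0-100 percentiles. To be clearer, we want to return a table with 202 rows and the columns letter, pctile, value1, value2, value3, where letter is a (101 times) and b (101 times), pctile runs 0, 1, 2, 3, ... 100, 0, 1, 2, 3, ... 100, and value1, value2, value3 are the values corresponding to the 0th, 1st, 2nd, 3rd, 4th, etc. percentile (for each group/letter).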
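I previously posted a very similar question here, and a helpful solution was provided there. However, that solution only covers the basic case of computing the 0-100 percentile rows for a single column. In the real example of our data we are dealing with multiple columns, and the solution from the previous post does not work when extended to the new data with 3 columns: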
SELECT letter, pctile, value1, value2, value3
FROM (
SELECT
letter,
APPROX_QUANTILES(num1, 100) AS value1,
APPROX_QUANTILES(num2, 100) AS value2,
APPROX_QUANTILES(num3, 100) AS value3,
FROM my_data
GROUP BY letter
) as t,
t.value1 WITH OFFSET AS pctile
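Technically this returns 202 rows, but the values in each row of value2 and value3 are not individual values; each one is the entire quantile array (101 elements, since APPROX_QUANTILES(x, 100) returns 100 + 1 boundaries). I have tried different approaches to get the desired result (202 rows, each with a single value1, value2, value3) without success. Is this possible?
Try the following: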
SELECT letter, pctile, value1, value2, value3
FROM (
SELECT
letter,
APPROX_QUANTILES(num1, 100) AS value1,
APPROX_QUANTILES(num2, 100) AS value2,
APPROX_QUANTILES(num3, 100) AS value3,
FROM my_data
GROUP BY letter
) as t
,t.value1 WITH OFFSET AS pctile
,t.value2 WITH OFFSET AS pctile2
,t.value3 WITH OFFSET AS pctile3
WHERE pctile = pctile2
AND pctile = pctile3
@Canovice - did this work for you?
Yes, this was helpful, thanks.
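A note on why the accepted query works: each comma-joined array in the FROM clause is an implicit UNNEST, so the three joins pair every element of value1 with every element of value2 and value3 within a group, and the WHERE clause then keeps only the rows whose offsets match. A possible alternative, sketched here as an assumption rather than as part of the original answer, generates the percentile index once with GENERATE_ARRAY and reads each array by position with OFFSET. The names quantiles, q1, q2, q3 and the shortened my_data sample are illustrative only.

```sql
WITH my_data AS (
  -- Shortened, illustrative sample; substitute the full my_data CTE from the question.
  SELECT 1 AS num1, 32 AS num2, 43 AS num3, 'a' AS letter UNION ALL
  SELECT 2, 21, 45, 'a' UNION ALL
  SELECT 3, 99, 47, 'a' UNION ALL
  SELECT 6, 35, 51, 'b' UNION ALL
  SELECT 7, 94, 52, 'b' UNION ALL
  SELECT 8, 17, 55, 'b'
),
quantiles AS (
  -- One row per letter; each q* column is an array of 101 quantile boundaries.
  SELECT
    letter,
    APPROX_QUANTILES(num1, 100) AS q1,
    APPROX_QUANTILES(num2, 100) AS q2,
    APPROX_QUANTILES(num3, 100) AS q3
  FROM my_data
  GROUP BY letter
)
SELECT
  letter,
  pctile,
  -- Look up the same position in all three arrays.
  q1[OFFSET(pctile)] AS value1,
  q2[OFFSET(pctile)] AS value2,
  q3[OFFSET(pctile)] AS value3
FROM quantiles
CROSS JOIN UNNEST(GENERATE_ARRAY(0, 100)) AS pctile
ORDER BY letter, pctile;
```

With two groups this should yield one row per letter and percentile (2 x 101 = 202 rows), without building the intermediate cross product of the three arrays. Keep in mind that APPROX_QUANTILES is an approximate aggregate, so on large inputs the boundaries are estimates.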