percent_rank() in BigQuery, conditioned to include only certain rows

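I posted a question about this before. The solution to that question works with the analytic function rank(), but not with percent_rank(). To demonstrate, I have the following dummy table: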

with 
  table as (
    select 'a' as category, 1 as num, 15 as num2, 7 as cutoff union all 
    select 'a' as category, 2 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 3 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 4 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 5 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 6 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 7 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 8 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 9 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 10 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 11 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 12 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 13 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 14 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 15 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 16 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 17 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 18 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 19 as num, 15 as num2, 7 as cutoff union all
    select 'a' as category, 20 as num, 5 as num2, 7 as cutoff union all
    select 'a' as category, 21 as num, 5 as num2, 7 as cutoff 
  )
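
I need percent_rank() on the num column. However, the percentile rank should only consider rows where num2 >= cutoff. I tried the following two approaches to compute the percentile: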

select
  *,
  if(num2 >= cutoff,
      percent_rank() over(
        partition by category
        order by num
      ), null) as pctile1,
  if(num2 >= cutoff,
      percent_rank() over(
        partition by category
        order by if (num2 >= cutoff, num, null) ASC
      ), null) as pctile2
from table
order by num asc

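Neither pctile1 nor pctile2 is correct. To see why, look at row 10 (num == 10), which has pctile1 == 0.45 and pctile2 == 0.60. Among the qualifying values, however, this should be a lower percentile. Only 2 qualifying values are below num == 10 (namely 1 and 2), while many values above 10 qualify (11 through 19, except 17). Given the cutoff value, the correct percentile for num == 10 should be close to 30%, because 10 is the third-lowest of the 11 qualifying values.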
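
To make those numbers concrete, here is the arithmetic behind them, written as a small query. It assumes the definition BigQuery documents for PERCENT_RANK, (RK - 1) / (NR - 1), where RK is the row's rank and NR is the number of rows in the partition. The ranks and row counts are read off the dummy table above, and the column aliases are made up for illustration.

```sql
#standardSQL
-- Worked arithmetic only: ranks and row counts are taken from the dummy table.
SELECT
  -- pctile1: num = 10 is rank 10 among all 21 rows.
  (10 - 1) / (21 - 1) AS pctile1_for_num_10,   -- 0.45
  -- pctile2: the 10 non-qualifying rows get a NULL sort key, which sorts
  -- first in ascending order, so num = 10 lands at rank 13 of 21.
  (13 - 1) / (21 - 1) AS pctile2_for_num_10,   -- 0.6
  -- Ranking only the 11 qualifying rows puts num = 10 at rank 3.
  (3 - 1) / (11 - 1) AS qualifying_only        -- 0.2
```

In both attempts the non-qualifying rows still count toward the partition size (21), which is why wrapping the window function in IF, or changing only the sort key, does not lower the result.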

Note that I should not filter the table to remove the rows that percent_rank() does not cover, because I need to keep those rows.

I would just use the option below:

#standardSQL
SELECT *,
  PERCENT_RANK() OVER(PARTITION BY category ORDER BY num) AS pctile
FROM table WHERE num2 >= cutoff
UNION ALL
SELECT *, NULL
FROM table WHERE num2 < cutoff
-- ORDER BY num

To me, the above is easy to read, but the version below is most likely what you want:

SELECT *,
  IF(num2 >= cutoff, 
    PERCENT_RANK() OVER(PARTITION BY IF(num2 >= cutoff, category, NULL) ORDER BY num), 
    NULL) AS pctile
FROM table
-- ORDER BY num

Obviously, this gives the same result as the query above.
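
As a rough illustration of why moving the condition into PARTITION BY helps: rows that fail the test get a NULL partition key, so they form their own partition and no longer count toward the rank or the row count of the qualifying rows. The sketch below runs the same expression on a hypothetical five-row sample (sample_rows is a made-up name, and the values are not from the question).

```sql
#standardSQL
-- Hypothetical sample: rows with num 1, 3 and 5 qualify (num2 >= cutoff).
WITH sample_rows AS (
  SELECT 'a' AS category, 1 AS num, 15 AS num2, 7 AS cutoff UNION ALL
  SELECT 'a', 2, 5, 7 UNION ALL
  SELECT 'a', 3, 15, 7 UNION ALL
  SELECT 'a', 4, 5, 7 UNION ALL
  SELECT 'a', 5, 15, 7
)
SELECT *,
  IF(num2 >= cutoff,
    -- Qualifying rows are partitioned by category; the rest share a NULL key.
    PERCENT_RANK() OVER(
      PARTITION BY IF(num2 >= cutoff, category, NULL)
      ORDER BY num),
    NULL) AS pctile
FROM sample_rows
ORDER BY num
-- pctile by num: 1 -> 0.0, 2 -> NULL, 3 -> 0.5, 4 -> NULL, 5 -> 1.0
```

With several categories, all non-qualifying rows would end up in the same NULL partition. That should be harmless here, because the outer IF discards their result anyway.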

Ahh, I put the if condition in the order by rather than in the partition by. Thanks for sharing and for the correction.