PostgreSQL aggregate functions and missing frame rows

Tags: postgresql, aggregate-functions, window-functions

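I'm trying to define a PostgreSQL aggregate function that is aware of rows that the frame clause asks for but that are missing. Specifically, consider an aggregate function framer whose job is to return an array of the values aggregated through it, with any values missing from the frame returned as null. So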

select
    n,
    v,
    framer(v) over (order by v rows between 2 preceding and 2 following) arr
from (values (1, 3200), (2, 2400), (3, 1600), (4, 2900), (5, 8200)) as v (n, v)
order by v

should return

"n" "v" "arr"
3   1600    {null,null,1600,2400,2900}
2   2400    {null,1600,2400,2900,3200}
4   2900    {1600,2400,2900,3200,8200}
1   3200    {2400,2900,3200,8200,null}
5   8200    {2900,3200,8200,null,null}

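Basically, I want to grab a range of values around each value, and it matters to me whether I'm missing values on the left, on the right, or on both sides. Seems simple enough. I expected something like this to work: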

create aggregate framer(anyelement) (
    sfunc = array_append,
    stype = anyarray,
    initcond = '{}'
);

But it returns

"n" "v" "arr"
3   1600    {1600,2400,2900}
2   2400    {1600,2400,2900,3200}
4   2900    {1600,2400,2900,3200,8200}
1   3200    {2400,2900,3200,8200}
5   8200    {2900,3200,8200}

So when two of the values are missing, sfunc really is only called three times.
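The clipping can be seen with a built-in aggregate. As a small sketch (not part of the original question), count(*) over the same frame shows how many rows each frame actually contains, and therefore how many values are fed into the aggregate for that row:

```sql
-- Same data and frame as above; count(*) reports the number of rows
-- that really exist inside each row's frame.
select
    n,
    v,
    count(*) over (order by v rows between 2 preceding and 2 following) as rows_in_frame
from (values (1, 3200), (2, 2400), (3, 1600), (4, 2900), (5, 8200)) as v (n, v)
order by v;
```

The counts come out as 3, 4, 5, 4, 3: at the edges of the partition the frame simply contains fewer rows, and no placeholder rows are generated for the missing positions.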

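I can't think of any non-absurd way to capture those missing rows. It seems like there should be a simple solution, such as somehow prepending/appending some sentinel nulls to the data before the aggregate runs, or somehow passing the index (and the frame values) to the function along with the actual value.

I'd like to implement this as an aggregate because it gives the best user-facing experience for what I want to do. Is there a better way?

FWIW, I'm on 9.6.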

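OK, this was an interesting problem. :)

I created an aggregate framer(anyelement, int), whose transition functions take (anyarray, anyelement, int), so that the array size can be defined to match the window size.

First, we replace array_append with our own framer_msfunc: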

CREATE OR REPLACE FUNCTION public.framer_msfunc(arr anyarray, val anyelement, size_ integer)
 RETURNS anyarray
 LANGUAGE plpgsql
AS $function$
DECLARE
    result ALIAS FOR $0;
    null_ val%TYPE := NULL; -- NULL of the same type as `val`
BEGIN

    IF COALESCE(array_length(arr, 1), 0) = 0 THEN
        -- create an array of nulls with the size of `size_`
        result := array_fill(null_, ARRAY[size_]);
    ELSE
        result := arr;
    END IF;

    IF result[size_] IS NULL THEN
        -- first run or after `minvfunc`.
        -- a NULL inserted at the end in `minvfunc` so we want to replace that.
        result[size_] := val;
    ELSE
        -- `minvfunc` not yet called so we just append and drop the first.
        result := (array_append(result, val))[2:];
    END IF;

    RETURN result;

END;
$function$;

Then we create the minvfunc that a moving aggregate requires:

CREATE OR REPLACE FUNCTION public.framer_minvfunc(arr anyarray, val anyelement, size_ integer)
 RETURNS anyarray
 LANGUAGE plpgsql
AS $function$
BEGIN

    -- drop the first in the array and append a null
    RETURN array_append(arr[2:], NULL);

END;
$function$;
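To see how the two functions cooperate, they can be called by hand. The following is only a sketch, assuming both functions above have been created as shown; the integer values are taken from the example further down, and the array literals stand in for the state PostgreSQL would pass along:

```sql
-- Three forward transitions on an empty state fill the array from the right,
-- leaving NULLs on the left for the rows that do not exist.
select framer_msfunc(
           framer_msfunc(
               framer_msfunc('{}'::int[], 1500, 5),
               1600, 5),
           2333, 5);
-- {NULL,NULL,1500,1600,2333}

-- An inverse transition drops the oldest value and leaves a NULL slot at the end...
select framer_minvfunc('{1600,2333,2400,2900,3200}'::int[], 1600, 5);
-- {2333,2400,2900,3200,NULL}

-- ...which the next forward transition fills instead of appending.
select framer_msfunc('{2333,2400,2900,3200,NULL}'::int[], 8200, 5);
-- {2333,2400,2900,3200,8200}
```

When the frame reaches the end of the partition, only inverse transitions happen, so the trailing NULL slots stay in place.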

Then we define the aggregate using the moving-aggregate parameters:

create aggregate framer(anyelement, int) (
    sfunc = framer_msfunc,
    stype = anyarray,
    msfunc = framer_msfunc,
    mstype = anyarray,
    minvfunc = framer_minvfunc,
    minitcond = '{}'
);

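Since an sfunc is required, we also set framer_msfunc as the sfunc, but it doesn't really work in that role. It could be replaced with a function that takes the same arguments but just calls array_append internally, so that it would actually do something useful.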
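A minimal sketch of such a replacement follows. The name framer_sfunc is hypothetical (it does not appear in the answer), and the function deliberately ignores the size argument:

```sql
-- Hypothetical plain transition function: same argument list as framer_msfunc,
-- but it only appends the value, so the non-window form of the aggregate
-- would behave like a simple array collector.
CREATE OR REPLACE FUNCTION public.framer_sfunc(arr anyarray, val anyelement, size_ integer)
 RETURNS anyarray
 LANGUAGE sql
AS $function$
    SELECT array_append(arr, val);
$function$;
```

It could then be named as sfunc in the create aggregate statement in place of framer_msfunc, while msfunc and minvfunc stay as they are for the windowed case.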

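Here is your example with a couple more input values.

The size argument should be at least the size of the window. It doesn't really work with smaller sizes.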

select
    n,
    v,
    framer(v, 5) over (order by v rows between 2 preceding and 2 following) arr
from (values (1, 3200), (2, 2400), (3, 1600), (4, 2900), (5, 8200), (6, 2333), (7, 1500)) as v (n, v)
order by v
;
 n |  v   |            arr
---+------+----------------------------
 7 | 1500 | {NULL,NULL,1500,1600,2333}
 3 | 1600 | {NULL,1500,1600,2333,2400}
 6 | 2333 | {1500,1600,2333,2400,2900}
 2 | 2400 | {1600,2333,2400,2900,3200}
 4 | 2900 | {2333,2400,2900,3200,8200}
 1 | 3200 | {2400,2900,3200,8200,NULL}
 5 | 8200 | {2900,3200,8200,NULL,NULL}
(7 rows)

It would be nice if the size could be inferred from the size of the window, but I don't know whether that can be done.

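I really like this, but it doesn't work when size_ is larger than the number of input rows. Consider

select n, v, framer(v, 3) over (order by v rows between 1 preceding and 1 following) arr from (values (1, 32), (2, 24)) as v (n, v) order by v;

which should return {null,24,32}, {24,32,null} but instead returns {null,24,32}, {null,24,32}. Postgres calls framer_msfunc twice to produce the first result and then reuses that answer for the second result, because both rows have the same frame. Since Postgres is willing to cache results, I wonder whether there are other tricky cases that would break this.

Ah... I couldn't find anything that would skip the caching for window aggregates... It looks like a custom window function written in C might be the solution!

Would you mind clarifying what exactly the Postgres caching in moving aggregate functions is?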
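For a frame of fixed, known width, one way to sidestep both the custom aggregate and the result reuse is to build the array from lag and lead, which return NULL when the requested row does not exist. This is only a sketch offered as an alternative, not something proposed in the answer or comments above; it uses the two-row example from the comment:

```sql
-- One preceding and one following row, padded with NULL where a row is missing.
select
    n,
    v,
    array[lag(v) over w, v, lead(v) over w] as arr
from (values (1, 32), (2, 24)) as v (n, v)
window w as (order by v)
order by v;
-- arr is {NULL,24,32} for v = 24 and {24,32,NULL} for v = 32
```

The trade-off is that the width is hard-coded in the query: a wider frame needs more terms with explicit offsets, such as lag(v, 2) and lead(v, 2).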