SQL Postgres - Aggregate user events by session


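I have a table of events with the following columns: ID, USER_ID, CREATED_AT, EVENT_NAME

I am trying to get the sequences of events that users typically produce within a session. A new session starts whenever a user's event occurs more than 5 minutes after that user's previous event.

I was even able to create a view containing the following information:

Reading the table in that order, a new session starts every time TIME_DIFF is greater than 5 minutes.

How can I now aggregate the events by session, so that I end up with something like this?

The table, the views, and some test data are shown below: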

CREATE SCHEMA test;

CREATE TABLE test."TRACKING_EVENTS" (
    "ID" serial PRIMARY key,
    "USER_ID" text,
    "CREATED_AT" TIMESTAMP,
    "EVENT_NAME" text
);

CREATE VIEW
    test."ORDERED_EVENTS"
AS
    SELECT 
        "ID", 
        "USER_ID", 
        "CREATED_AT" AS "EVENT_TIME", 
        "EVENT_NAME",
        CASE WHEN 
            lag("CREATED_AT", 1) OVER (ORDER BY "USER_ID", "CREATED_AT") < "CREATED_AT" 
        THEN 
            lag("CREATED_AT", 1) OVER (ORDER BY "USER_ID", "CREATED_AT")
        ELSE
            NULL 
        END AS "PREVIOUS_EVENT_TIME" 
    FROM 
        test."TRACKING_EVENTS";

CREATE VIEW
    test."ORDERED_EVENTS_WITH_DIFF"
AS
    SELECT  
        "ID", 
        "USER_ID", 
        "EVENT_TIME", 
        "EVENT_NAME",
        "PREVIOUS_EVENT_TIME",
        "EVENT_TIME" - "PREVIOUS_EVENT_TIME" AS "TIME_DIFF"
    FROM 
        test."ORDERED_EVENTS";

-- Period 1
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (1, 'user1', '2019-1-1 01:00:00'::timestamp, 'EVENT_1');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (3, 'user1', '2019-1-1 01:00:05'::timestamp, 'EVENT_2');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (5, 'user1', '2019-1-1 01:00:10'::timestamp, 'EVENT_3');

INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (2, 'user2', '2019-1-1 01:00:01'::timestamp, 'EVENT_1');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (4, 'user2', '2019-1-1 01:00:06'::timestamp, 'EVENT_2');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (6, 'user2', '2019-1-1 01:00:11'::timestamp, 'EVENT_3');

-- Period 2
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (7, 'user1', '2019-1-1 01:10:00'::timestamp, 'EVENT_1');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (9, 'user1', '2019-1-1 01:10:05'::timestamp, 'EVENT_2');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (11, 'user1', '2019-1-1 01:10:10'::timestamp, 'EVENT_3');

INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (8, 'user2', '2019-1-1 01:10:01'::timestamp, 'EVENT_1');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (10, 'user2', '2019-1-1 01:10:06'::timestamp, 'EVENT_2');
INSERT INTO test."TRACKING_EVENTS" ("ID", "USER_ID", "CREATED_AT", "EVENT_NAME") 
VALUES (12, 'user2', '2019-1-1 01:10:11'::timestamp, 'EVENT_3');

I think this is what you want:

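-- Column names are double-quoted because the table was created with
-- quoted upper-case identifiers; unquoted names would fold to lower case.
select "USER_ID", session,
       array_agg("EVENT_NAME" order by "CREATED_AT") as events
from (select tt.*,
             -- running count of gaps longer than 5 minutes = session number
             count(*) filter (where prev_ca < "CREATED_AT" - interval '5 minute') over (partition by "USER_ID" order by "CREATED_AT") as session
      from (select tt.*,
                   -- previous event time of the same user (NULL for the first event)
                   lag("CREATED_AT") over (partition by "USER_ID" order by "CREATED_AT") as prev_ca
            from test."TRACKING_EVENTS" tt
           ) tt
     ) tt
group by "USER_ID", session
order by "USER_ID", session;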
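
To see how the running count turns gaps into session numbers, it can help to look at the rows before they are grouped. The following is only a sketch under the assumption that the table and test data from the question are in place; the aliases `te`, `gap`, and `session_no` are illustrative names, not part of the original answer.

```sql
-- Sketch: per-event session numbering, before the GROUP BY step.
-- Assumes test."TRACKING_EVENTS" and the test data from the question.
SELECT "USER_ID",
       "CREATED_AT",
       "EVENT_NAME",
       "CREATED_AT" - prev_ca AS gap,   -- NULL for a user's first event
       -- every gap longer than 5 minutes increments the running count,
       -- so the count works as a 0-based session index per user
       count(*) FILTER (WHERE "CREATED_AT" - prev_ca > interval '5 minutes')
           OVER (PARTITION BY "USER_ID" ORDER BY "CREATED_AT") AS session_no
FROM (SELECT te.*,
             lag("CREATED_AT") OVER (PARTITION BY "USER_ID"
                                     ORDER BY "CREATED_AT") AS prev_ca
      FROM test."TRACKING_EVENTS" te
     ) te
ORDER BY "USER_ID", "CREATED_AT";
```

With the question's test data, each user's first three events should get `session_no` 0 and the last three should get 1, because the only gap above 5 minutes is the one between the two periods.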

Note that this uses an array rather than a string. You are using Postgres, so array_agg is a good way to combine multiple values.
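
If a single string per session is preferred over an array, `string_agg` can be swapped in for `array_agg`. This is a minimal sketch, again assuming the table from the question; the `' > '` separator and the aliases `session_no` and `event_path` are arbitrary choices made for illustration.

```sql
-- Sketch: same session logic, but events joined into one string per session.
SELECT "USER_ID",
       session_no,
       string_agg("EVENT_NAME", ' > ' ORDER BY "CREATED_AT") AS event_path
FROM (SELECT te.*,
             -- running count of gaps longer than 5 minutes, per user
             count(*) FILTER (WHERE "CREATED_AT" - prev_ca > interval '5 minutes')
                 OVER (PARTITION BY "USER_ID" ORDER BY "CREATED_AT") AS session_no
      FROM (SELECT te.*,
                   lag("CREATED_AT") OVER (PARTITION BY "USER_ID"
                                           ORDER BY "CREATED_AT") AS prev_ca
            FROM test."TRACKING_EVENTS" te
           ) te
     ) te
GROUP BY "USER_ID", session_no
ORDER BY "USER_ID", session_no;
```

The array form is usually easier to process further in SQL (for example with `unnest`), while the string form is convenient for counting how often each path occurs.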

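Thank you very much! I had to adjust the query slightly, because there was a problem with the last user session: the query either grouped the last two sessions together or created an extra session containing only one event. This is the change I made: select tt.*, lag(created_at) over (order by USER_ID, CREATED_AT) as prev_ca from test.TRACKING_EVENTS tt

@FelipeTaiarol . . . I think this should be partition by user_id order by created_at.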
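
The difference between the two window definitions discussed in these comments can be checked side by side. The query below is a sketch that assumes the question's table and test data; `prev_global` and `prev_per_user` are illustrative column names.

```sql
-- Sketch: lag() over one global ordering vs. lag() partitioned by user.
SELECT "USER_ID",
       "CREATED_AT",
       -- one ordering across all users: a user's first event
       -- "sees" the last event of the previous user
       lag("CREATED_AT") OVER (ORDER BY "USER_ID", "CREATED_AT") AS prev_global,
       -- restarted per user: a user's first event has no predecessor
       lag("CREATED_AT") OVER (PARTITION BY "USER_ID"
                               ORDER BY "CREATED_AT") AS prev_per_user
FROM test."TRACKING_EVENTS"
ORDER BY "USER_ID", "CREATED_AT";
```

For the first event of user2, `prev_global` holds user1's last timestamp, while `prev_per_user` is NULL. Only the partitioned form keeps one user's events from affecting another user's session boundaries.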