In PostgreSQL, how do I count runs in order over repeating partitions?
Tags: sql, postgresql, sequence, aggregation, gaps-and-islands

I have a user who completes a series of events. I want to record the number of times they completed each event, and in what order.

So, for the user_events table, I should get:
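| name | eventname | event_sequence_number | time_started | frequency |
|------|-----------|-----------------------|--------------|-----------|
| Ted | a | 1 | 12:01 | 1 |
| Ted | b | 2 | 12:02 | 3 |
| Ted | c | 3 | 12:05 | 1 |
| Ted | b | 4 | 12:06 | 2 |
| Ted | c | 5 | 12:08 | 1 |
| Ted | b | 6 | 12:09 | 3 |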
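I've been experimenting with rank, dense_rank, row_number and lag, but I can't put them together. Any ideas?

Try this. It uses the Tabibitosan method to group the sequence ranges. The schema setup (PostgreSQL 9.6), the query, and its results follow.

A beautiful piece of code; I ended up creating something similar. It is very useful for building cleaner path analysis by aggregating the behavior of millions of online users.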
-- Schema setup (PostgreSQL 9.6)
CREATE TABLE user_events (
    user_name  varchar(3),
    eventname  varchar(1),
    event_time time
);

INSERT INTO user_events (user_name, eventname, event_time)
VALUES
    ('Ted', 'a', '12:01'),
    ('Ted', 'b', '12:02'),
    ('Ted', 'b', '12:03'),
    ('Ted', 'b', '12:04'),
    ('Ted', 'c', '12:05'),
    ('Ted', 'b', '12:06'),
    ('Ted', 'b', '12:07'),
    ('Ted', 'c', '12:08'),
    ('Ted', 'b', '12:09'),
    ('Ted', 'b', '12:11'),
    ('Ted', 'b', '12:12');
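-- Query 1: collapse each run of consecutive identical events into one row.
-- Note: none of the windows partition by user_name, so this assumes
-- the table holds a single user's events, as in the sample data.
SELECT t.user_name
     , t.eventname
     , row_number() OVER (ORDER BY MIN(event_time)) AS event_sequence_number
     , MIN(event_time) AS time_started   -- first event of the run
     , COUNT(*) AS frequency             -- length of the run
FROM (
    SELECT user_name
         , eventname
         , event_time
           -- Overall position minus position among rows with the same eventname.
           -- The difference stays constant inside a run and changes between
           -- runs of the same event.
         , row_number() OVER (ORDER BY event_time)
           - row_number() OVER (PARTITION BY eventname ORDER BY event_time, eventname) AS seq
    FROM user_events
) t
GROUP BY user_name
       , eventname
       , seq
ORDER BY time_started;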
| user_name | eventname | event_sequence_number | time_started | frequency |
|-----------|-----------|-----------------------|--------------|-----------|
| Ted | a | 1 | 12:01:00 | 1 |
| Ted | b | 2 | 12:02:00 | 3 |
| Ted | c | 3 | 12:05:00 | 1 |
| Ted | b | 4 | 12:06:00 | 2 |
| Ted | c | 5 | 12:08:00 | 1 |
| Ted | b | 6 | 12:09:00 | 3 |
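
To see why subtracting the two row numbers isolates each run, it can help to run the inner subquery on its own with both row numbers exposed. The following is a minimal sketch against the same user_events table; the column aliases rn_all and rn_event are names made up for illustration and do not appear in the answer above.

```sql
-- Diagnostic view of the Tabibitosan grouping key.
-- rn_all and rn_event are hypothetical aliases used only for this illustration.
SELECT user_name,
       eventname,
       event_time,
       -- position of the row among all events
       row_number() OVER (ORDER BY event_time) AS rn_all,
       -- position of the row among events with the same eventname
       row_number() OVER (PARTITION BY eventname ORDER BY event_time) AS rn_event,
       -- the difference is the grouping key used by the outer query
       row_number() OVER (ORDER BY event_time)
         - row_number() OVER (PARTITION BY eventname ORDER BY event_time) AS seq
FROM user_events
ORDER BY event_time;
-- For the sample data, the 'b' rows get seq = 1 for 12:02 to 12:04,
-- seq = 2 for 12:06 to 12:07, and seq = 3 for 12:09 to 12:12.
```

Within a run both counters advance together, so seq does not change. When a different event interrupts, only rn_all advances, so the next run of the same event lands on a different seq. The value of seq alone is not unique across events, which is why the outer query groups by eventname and seq together.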