Consecutive days in SQL

sql postgresql

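I found many Stack Overflow Q&As about consecutive days. The answers are still too short for me to understand what is actually going on.

To be concrete, I'll make up a model (a table). I'm using PostgreSQL, if that matters.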

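CREATE TABLE work (
    id serial PRIMARY KEY,  -- auto-generated, so the INSERTs below can omit it
    user_id integer NOT NULL,
    arrived_at timestamp with time zone NOT NULL
);


-- ISO dates, so the result does not depend on the session's DateStyle
INSERT INTO work(user_id, arrived_at) VALUES (1, '2011-01-03');
INSERT INTO work(user_id, arrived_at) VALUES (1, '2011-01-04');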

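In its simplest form, for a given user, I want to find the last consecutive date range.

My ultimate goal is to find a given user's consecutive working days. If he came to work yesterday, he still has a chance to extend his streak today, so I show him the consecutive days up to yesterday. But if he missed yesterday, his streak is either 0 or 1 days, depending on whether he comes in today.

Suppose today is the 8th: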

3 * 5 6 7 * = 3 days (5 to 7)
3 * 5 6 7 8 = 4 days (5 to 8)
3 4 5 * 7 * = 1 day (7 to 7)
3 * * * * * = 0 day 
3 * * * * 8 = 1 day (8 to 8)

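Here is my approach to this problem using a CTE.

Check the code at the following location:

Here is how the query works:

It selects today's record from the attendance table. If today's record is not available, it selects yesterday's record instead.

It then keeps recursively adding the record for the day before the current minimum date.

If you want to select the latest consecutive date range regardless of when the user's latest attendance was (today, yesterday, or x days ago), the initialization part of the CTE must be replaced with the following snippet: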

SELECT MAX(attendanceDate) FROM attendance
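
The linked code is not reproduced here, so the following is only a rough sketch of the steps described above, not the answer's original query. It assumes the `attendance` table has a `date` column `attendanceDate` (both names appear above) plus a hypothetical `userId` column; `att_day`, `streak`, `first_day`, `last_day` and `nday` are illustrative names. It uses UNION rather than UNION ALL so that several rows for the same day do not multiply during recursion.

```sql
-- Sketch only. Assumed schema: attendance(userId integer, attendanceDate date).
WITH RECURSIVE streak(att_day) AS (
    -- Initialization: today's record if present, otherwise yesterday's.
    -- max() yields a single NULL row when neither exists, which ends the recursion.
    SELECT max(attendanceDate)
    FROM   attendance
    WHERE  userId = 1                       -- given user (assumed column)
    AND    attendanceDate IN (current_date, current_date - 1)
  UNION
    -- Recursive step: add the record for the day before the current one.
    SELECT a.attendanceDate
    FROM   streak s
    JOIN   attendance a ON a.attendanceDate = s.att_day - 1
    WHERE  a.userId = 1
)
SELECT min(att_day)   AS first_day,
       max(att_day)   AS last_day,
       count(att_day) AS nday               -- count() skips the NULL row, giving 0
FROM   streak;
```

To ignore the today/yesterday rule, the initialization would drop the `IN (...)` condition, which matches the `MAX(attendanceDate)` snippet above.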

[Edit] Here is the query on SQL Fiddle that solves your question 1:

Result:

CREATE TABLE
INSERT 0 14
 user_id | first_day  |  last_day  | nday 
---------+------------+------------+------
       1 | 2014-02-05 | 2014-02-07 |    3
       2 | 2014-02-05 | 2014-02-08 |    4
(2 rows)
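
The fiddle's query is not included above. One common way to get output of this shape is the "gaps and islands" technique with `row_number()`, which is a different technique from the recursive CTE described earlier. The sketch below runs against the question's `work` table; `days`, `islands`, `work_day` and `grp` are illustrative names, and it reports each user's most recent streak without applying the today/yesterday rule.

```sql
-- Sketch only: latest run of consecutive calendar days per user.
WITH days AS (
    -- One row per user and calendar day (the cast uses the session time zone).
    SELECT DISTINCT user_id, arrived_at::date AS work_day
    FROM   work
), islands AS (
    -- Within a run of consecutive days, (day - row number) stays constant.
    SELECT user_id, work_day,
           work_day - (row_number() OVER (PARTITION BY user_id
                                          ORDER BY work_day))::int AS grp
    FROM   days
)
SELECT DISTINCT ON (user_id)
       user_id,
       min(work_day) AS first_day,
       max(work_day) AS last_day,
       count(*)      AS nday
FROM   islands
GROUP  BY user_id, grp
ORDER  BY user_id, max(work_day) DESC;      -- keep the most recent run per user
```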

You can create an aggregate using a range type:

Create function sfunc (tstzrange, timestamptz)
    returns tstzrange
    language sql strict as $$
        select case when $2 - upper($1) <= '1 day'::interval
                then tstzrange(lower($1), $2, '[]')
                else tstzrange($2, $2, '[]') end
    $$;

Create aggregate consecutive (timestamptz) (
        sfunc = sfunc,
        stype = tstzrange,
        initcond = '[,]'
);

Use the aggregate in a window function:

Select *,
        consecutive(arrived_at)
                over (partition by user_id order by arrived_at)
    from work;

    ┌────┬─────────┬────────────────────────┬─────────────────────────────────────────────────────┐
    │ id │ user_id │       arrived_at       │                     consecutive                     │
    ├────┼─────────┼────────────────────────┼─────────────────────────────────────────────────────┤
    │  1 │       1 │ 2011-01-03 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-03 00:00:00+02"] │
    │  2 │       1 │ 2011-01-04 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-04 00:00:00+02"] │
    │  3 │       1 │ 2011-01-05 00:00:00+02 │ ["2011-01-03 00:00:00+02","2011-01-05 00:00:00+02"] │
    │  4 │       2 │ 2011-01-06 00:00:00+02 │ ["2011-01-06 00:00:00+02","2011-01-06 00:00:00+02"] │
    └────┴─────────┴────────────────────────┴─────────────────────────────────────────────────────┘

Query the result to find what you need:

With work_detail as (select *,
            consecutive(arrived_at)
                    over (partition by user_id order by arrived_at)
        from work)
    select arrived_at, upper(consecutive) - lower(consecutive) as days
        from work_detail
            where user_id = 1 and upper(consecutive) != lower(consecutive)
            order by arrived_at desc
                limit 1;

    ┌────────────────────────┬────────┐
    │       arrived_at       │  days  │
    ├────────────────────────┼────────┤
    │ 2011-01-05 00:00:00+02 │ 2 days │
    └────────────────────────┴────────┘
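
Note that `upper(consecutive) - lower(consecutive)` is the interval between the first and last arrival, so three attended days (the 3rd to the 5th) show up as `2 days`, and the `upper != lower` filter skips one-day streaks. If the goal is to count attended days inclusively, a variant along these lines might be closer. It is a sketch that assumes the `consecutive` aggregate defined above and relies on the session time zone for the cast to `date`.

```sql
-- Sketch only: inclusive day count for the user's most recent streak.
WITH work_detail AS (
    SELECT *,
           consecutive(arrived_at)
               OVER (PARTITION BY user_id ORDER BY arrived_at)
    FROM   work
)
SELECT arrived_at,
       -- date - date yields an integer number of days; +1 counts both endpoints
       upper(consecutive)::date - lower(consecutive)::date + 1 AS days
FROM   work_detail
WHERE  user_id = 1
ORDER  BY arrived_at DESC
LIMIT  1;
```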

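You can even do this without a recursive CTE, using generate_series(), a LEFT JOIN, a running window count() and a final LIMIT 1:

1 for "today" plus the consecutive days up until "yesterday":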

SELECT count(*)   -- 1 / 0  for "today"
     + COALESCE(( -- + optional count of consecutive days up until "yesterday"
       SELECT ct
       FROM  (
          SELECT d.ct, count(w.arrived_at) OVER (ORDER BY d.ct) AS day_ct
          FROM   generate_series(1, 8) AS d(ct)   -- maximum = 8
          LEFT   JOIN work w ON  w.arrived_at >= current_date -  d.ct
                             AND w.arrived_at <  current_date - (d.ct - 1)
                             AND w.user_id = 1    -- given user
          ) sub
       WHERE  ct = day_ct
       ORDER  BY ct DESC
       LIMIT  1
       ), 0) AS total
FROM   work
WHERE  arrived_at >= current_date  -- no future timestamps
AND    user_id = 1                 -- given user

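Interesting question... can you add the table schema?

Schema and sample data as CREATE TABLE and INSERT statements, plus the expected results.

Please add real DDL + sample data. Please don't use shorthand.

Can you give me the original fiddle? It seems to solve my first question, without the today/yesterday consideration, so that I can first understand the basics of your query.

If a user can attend more than once a day, and that should still count as one day of attendance, would your code need a complete rewrite?

No, I think we just need to extract the date part from attendanceDate wherever it is used in the query; the rest should work fine.

Is this faster than the CTE solution?

@eugene: Probably yes. Consider the simplified update. Can you run EXPLAIN ANALYSE on your data?

I don't have a large enough data set yet. It took me a long time to convert the answer to my actual schema.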
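
For the case raised in the comments, where a user may arrive more than once per day, the generate_series query above could be adapted by reducing arrivals to distinct calendar days first. This is a sketch, not the answerer's own update; `days` and `day_ct` usage aside, `work_day` is an illustrative name, and the cast to `date` depends on the session time zone.

```sql
-- Sketch only: the same counting logic over distinct calendar days.
WITH days AS (
    SELECT DISTINCT arrived_at::date AS work_day
    FROM   work
    WHERE  user_id = 1                      -- given user
)
SELECT (SELECT count(*) FROM days WHERE work_day = current_date)  -- 1 / 0 for "today"
     + COALESCE((
         SELECT ct
         FROM  (
            SELECT d.ct, count(w.work_day) OVER (ORDER BY d.ct) AS day_ct
            FROM   generate_series(1, 8) AS d(ct)   -- look back at most 8 days
            LEFT   JOIN days w ON w.work_day = current_date - d.ct
            ) sub
         WHERE  ct = day_ct      -- every day from yesterday back to ct was attended
         ORDER  BY ct DESC
         LIMIT  1
       ), 0) AS total;
```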

CREATE INDEX foo_idx ON work (user_id,arrived_at);
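
To check whether an index like this is actually used, as suggested in the comments, one could inspect the plan of a range lookup. The query below is only an illustrative probe over the last eight days for one user, not one of the answers' queries.

```sql
-- Sketch only: inspect the plan and timing of a per-user date-range lookup.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM   work
WHERE  user_id = 1
AND    arrived_at >= current_date - 8
AND    arrived_at <  current_date + 1;
```

On a very small table the planner may still prefer a sequential scan, so the result is only meaningful with a realistic amount of data.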