SQL: getting the correct count for the longest user streak
I'm having trouble getting the correct count for the longest user streak. A streak is the number of consecutive days on which a user has a check-in. Any help would be greatly appreciated. Below are my script and sample data.
Check-ins table:
user_id goal_id check_in_date
------------------------------------------
| colt | 40365fa0 | 2019-01-07 15:35:53
| colt | d31efe70 | 2019-01-11 15:35:52
| berry| be2fcd50 | 2019-01-12 15:35:51
| colt | e754d050 | 2019-01-13 15:17:16
| colt | 9c87a7f0 | 2019-01-14 15:35:54
| colt | ucgtdes0 | 2019-01-15 12:30:59
PostgreSQL script:
WITH dates(DATE) AS (
    SELECT DISTINCT CAST(check_in_date AS DATE),
           user_id
    FROM check_ins
),
GROUPS AS (
    SELECT ROW_NUMBER() OVER (ORDER BY DATE) AS rn,
           DATE - (ROW_NUMBER() OVER (ORDER BY DATE) * INTERVAL '1' DAY) AS grp,
           DATE,
           user_id
    FROM dates
)
SELECT COUNT(*) AS streak,
       user_id
FROM GROUPS
GROUP BY grp,
         user_id
ORDER BY 1 DESC;
Here is the result I get when I run the code above:
streak user_id
--------------
4 colt
1 colt
1 berry
Here is what it should be. I also want to get the longest streak for each user:
streak user_id
--------------
3 colt
1 berry
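First of all, thanks for the fiddle script and sample data. You are not using row_number correctly to solve this gaps-and-islands problem: the row number has to be computed per user, not across all rows. It should look something like the query below for your dataset. Beyond that, to get the longest streak per user, you need to use DISTINCT ON after grouping by the group number (grp in your query, which I call seq).
I assume you want to count only one entry per user per day. I tried to reflect that with a small change in the WITH clause.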
SELECT * FROM (
    WITH check_ins_dt AS (
        SELECT DISTINCT check_in_date::DATE AS check_in_date,
               user_id
        FROM check_ins
    )
    SELECT DISTINCT ON (user_id) COUNT(*) AS streak, user_id
    FROM (
        SELECT c.*,
               -- Date minus the per-user row number: consecutive days
               -- for one user map to the same seq value.
               c.check_in_date - (ROW_NUMBER() OVER (
                   PARTITION BY user_id
                   ORDER BY check_in_date
               ))::INT AS seq
        FROM check_ins_dt c
    ) s
    GROUP BY user_id,
             seq
    ORDER BY user_id,
             COUNT(*) DESC
) q
ORDER BY streak DESC;
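To see what a gaps-and-islands grouping key does, it can help to print it row by row. The following is a minimal sketch, not part of the answer above: it inlines the question's sample rows as a VALUES list so it runs on its own, and the names days, check_in_day, rn and grp are illustrative. It subtracts a per-user row number from the calendar date, which is one common way to build such a key.

```sql
-- Sample rows from the question, inlined so the sketch is self-contained.
WITH check_ins(user_id, check_in_date) AS (
    VALUES ('colt',  TIMESTAMP '2019-01-07 15:35:53'),
           ('colt',  TIMESTAMP '2019-01-11 15:35:52'),
           ('berry', TIMESTAMP '2019-01-12 15:35:51'),
           ('colt',  TIMESTAMP '2019-01-13 15:17:16'),
           ('colt',  TIMESTAMP '2019-01-14 15:35:54'),
           ('colt',  TIMESTAMP '2019-01-15 12:30:59')
),
days AS (
    -- One row per user per calendar day.
    SELECT DISTINCT user_id,
           check_in_date::date AS check_in_day
    FROM check_ins
)
SELECT user_id,
       check_in_day,
       ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY check_in_day) AS rn,
       -- date - integer yields a date; consecutive days share the same value.
       check_in_day
         - (ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY check_in_day))::int AS grp
FROM days
ORDER BY user_id, check_in_day;
```

With the sample rows, colt's check-ins on January 13, 14 and 15 all get grp = 2019-01-10, while January 7 and 11 get 2019-01-06 and 2019-01-09, so grouping by (user_id, grp) yields streaks of 3, 1 and 1 for colt. Without PARTITION BY user_id, berry's row on January 12 shifts colt's row numbers so that January 11 through 15 share one key, which is where the count of 4 in the question comes from.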
In Postgres, you can write this as:
select distinct on (user_id) user_id, count(distinct check_in_date::date) as num_days
from (select ci.*,
dense_rank() over (partition by user_id order by check_in_date::date) as seq
from check_ins ci
) ci
group by user_id, check_in_date::date - seq * interval '1 day'
order by user_id, num_days desc;
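Here is a fiddle.
This follows similar logic to your approach, but your query seems overly complicated. It does use the Postgres distinct on feature, which is a convenient way to avoid an extra subquery.
Comments:
MySQL or PostgreSQL? Please tag your question with the database you are actually using.
Hey, Gordon! This looks good, it's compact, and it works. The only thing that confuses me is ordering the results by number of days in descending order.
@twobergs. For that, you need a subquery.
Thanks for your solution!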
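The subquery mentioned in the last comment might look something like the sketch below. It wraps the distinct on query unchanged and sorts the outer result. The alias longest and the tie-breaker on user_id are assumptions added for illustration.

```sql
SELECT user_id, num_days
FROM (
    -- Inner query: the longest streak per user, as in the answer above.
    SELECT DISTINCT ON (user_id)
           user_id,
           COUNT(DISTINCT check_in_date::date) AS num_days
    FROM (
        SELECT ci.*,
               DENSE_RANK() OVER (PARTITION BY user_id
                                  ORDER BY check_in_date::date) AS seq
        FROM check_ins ci
    ) ci
    GROUP BY user_id, check_in_date::date - seq * INTERVAL '1 day'
    ORDER BY user_id, num_days DESC
) longest
-- Outer query: free to sort by streak length, since DISTINCT ON
-- has already picked one row per user.
ORDER BY num_days DESC, user_id;
```

The extra level is needed because distinct on requires the inner ORDER BY to start with user_id, so the final ordering by streak length has to happen one level up. Against the sample data in the question this should list colt with 3 days first, then berry with 1.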