SQL return consecutive records
Tags: sql, linq, lambda, sql-server-2008-r2
A simple table:
ForumPost
--------------
ID (int PK)
UserID (int FK)
Date (datetime)
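What I want to return is the number of times a particular user has made at least one post a day for n consecutive days.
Example:
User 15844 has posted at least 1 post a day for 30 consecutive days 10 times
I've tagged this question with linq/lambda as well, since a solution there would also be great. I know I can solve this by iterating over all of a user's records, but that is slow.

There is a handy trick using ROW_NUMBER() to find consecutive entries. Imagine a set of dates together with their row number (starting at 0). For consecutive entries, subtracting the row number from the value gives the same result, e.g.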
Date        RowNumber   date - row_number
20130401    0           20130401
20130402    1           20130401
20130403    2           20130401
20130404    3           20130401
20130406    4           20130402
20130407    5           20130402
You can then group by date - row_number to get the sets of consecutive days (i.e. the first 4 records and the last 2 records).
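To make the grouping step concrete, here is a minimal Python sketch of the same idea applied to the six dates in the table above. The function name `consecutive_runs` is made up for illustration; it is not part of the SQL answer.

```python
from datetime import date, timedelta
from itertools import groupby


def consecutive_runs(dates):
    """Group distinct dates into runs of consecutive days."""
    ordered = sorted(set(dates))
    # Subtracting the row number (as days) yields the same key for every
    # date inside one run, exactly like "date - row_number" in the table.
    keyed = [(d - timedelta(days=i), d) for i, d in enumerate(ordered)]
    return [[d for _, d in grp] for _, grp in groupby(keyed, key=lambda kd: kd[0])]


# The dates from the table: 1-4 April, then 6-7 April 2013.
dates = [date(2013, 4, n) for n in (1, 2, 3, 4, 6, 7)]
runs = consecutive_runs(dates)
print([len(r) for r in runs])
```

The first run holds the four dates that share the key 2013-04-01 and the second holds the two that share 2013-04-02.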
To apply this to your example, you would use:
WITH Posts AS
(   SELECT  FirstPost = DATEADD(DAY, 1 - ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [Date]), [Date]),
            UserID,
            Date
    FROM    (   SELECT  DISTINCT UserID, [Date] = CAST(Date AS [Date])
                FROM    ForumPost
            ) fp
), Posts2 AS
(   SELECT  FirstPost,
            UserID,
            Days = COUNT(*),
            LastDate = MAX(Date)
    FROM    Posts
    GROUP BY FirstPost, UserID
)
SELECT  UserID, ConsecutiveDates = MAX(Days)
FROM    Posts2
GROUP BY UserID;
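EDIT
I don't think the above quite answered the question. This gives the number of times a user has posted on n or more consecutive days (here n = 30):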
WITH Posts AS
(   SELECT  FirstPost = DATEADD(DAY, 1 - ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [Date]), [Date]),
            UserID,
            Date
    FROM    (   SELECT  DISTINCT UserID, [Date] = CAST(Date AS [Date])
                FROM    ForumPost
            ) fp
), Posts2 AS
(   SELECT  FirstPost,
            UserID,
            Days = COUNT(*),
            FirstDate = MIN(Date),
            LastDate = MAX(Date)
    FROM    Posts
    GROUP BY FirstPost, UserID
)
SELECT  UserID, [Times Over N Days] = COUNT(*)
FROM    Posts2
WHERE   Days >= 30
GROUP BY UserID;
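For readers who want to check the logic outside the database, the following is a rough Python sketch of what the query computes, assuming posts arrive as `(user_id, date)` pairs. The function name `streaks_at_least` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta


def streaks_at_least(posts, n):
    """posts: iterable of (user_id, date). Returns {user_id: runs of >= n days}."""
    days_by_user = defaultdict(set)
    for user_id, day in posts:
        days_by_user[user_id].add(day)  # one entry per user per day (the DISTINCT step)

    result = {}
    for user_id, days in days_by_user.items():
        run_lengths = defaultdict(int)
        for i, day in enumerate(sorted(days)):
            # date minus row number is constant within a run of consecutive days
            run_lengths[day - timedelta(days=i)] += 1
        count = sum(1 for length in run_lengths.values() if length >= n)
        if count:  # users without a qualifying run are left out, like the WHERE filter
            result[user_id] = count
    return result


# Hypothetical data: user 1 has runs of 3, 4 and 1 days; user 2 never posts two days in a row.
posts = (
    [(1, date(2013, 1, 1) + timedelta(days=k)) for k in range(3)]
    + [(1, date(2013, 1, 10) + timedelta(days=k)) for k in range(4)]
    + [(1, date(2013, 1, 20))]
    + [(2, date(2013, 1, 1)), (2, date(2013, 1, 3))]
)
print(streaks_at_least(posts, 3))
```

As in the query, a run longer than n days still counts once, so a 60-day run is a single streak of "30 or more" rather than two.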
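I think your particular application makes this pretty simple. If you have 'n' distinct dates in an 'n'-day interval, those 'n' distinct dates must be consecutive.

Scroll to the bottom for a more general solution that requires only common table expressions and changing to PostgreSQL. (Kidding. I implemented it in PostgreSQL because I'm short of time.)

Now, let's look at the output of this query. For brevity, I'm looking at 5-day intervals rather than 30-day intervals.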
select userid, count(distinct post_date) distinct_dates
from forumpost
where post_date between '2013-01-15' and '2013-01-19'
group by userid;
USERID DISTINCT_DATES
1 5
2 5
3 1
For a user to qualify, the number of distinct dates in that 5-day interval has to be 5, right? So we just need to add that logic to a HAVING clause.
select userid, count(distinct post_date) distinct_dates
from forumpost
where post_date between '2013-01-15' and '2013-01-19'
group by userid
having count(distinct post_date) = 5;
USERID DISTINCT_DATES
1 5
2 5
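The same check can be sketched in Python, under the assumption that posts are `(user_id, date)` pairs. The function name `posted_every_day` is hypothetical, and the sample rows are invented so that they reproduce the counts shown above (5, 5 and 1).

```python
from datetime import date, timedelta


def posted_every_day(posts, start, n):
    """Return sorted user ids with n distinct post dates in [start, start + n - 1]."""
    end = start + timedelta(days=n - 1)
    seen = {}
    for user_id, day in posts:
        if start <= day <= end:
            seen.setdefault(user_id, set()).add(day)
    # n distinct dates inside an n-day window can only be n consecutive days
    return sorted(u for u, days in seen.items() if len(days) == n)


start = date(2013, 1, 15)
posts = (
    [(1, start + timedelta(days=k)) for k in range(5)]
    + [(2, start + timedelta(days=k)) for k in range(5)]
    + [(2, start)]  # a second post on the same day adds no distinct date
    + [(3, start)]  # user 3 posts on one day only
)
print(posted_every_day(posts, start, 5))
```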
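A more general solution

If you post every day from 2013-01-01 to 2013-01-31, it doesn't make sense to say you've posted on 30 consecutive days 2 times. Instead, I'd expect the clock to start over on 2013-01-31. My apologies for implementing this in PostgreSQL; I'll try to implement it in T-SQL later.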
with first_posts as (
  -- earliest post per user anchors that user's periods
  select userid, min(post_date) first_post_date
  from forumpost
  group by userid
),
period_intervals as (
  -- first 5-day period: first post date through four days later
  select userid, first_post_date period_start,
         (first_post_date + interval '4' day)::date period_end
  from first_posts
), user_specific_intervals as (
  -- shift the first period forward in steps of 5 days
  select
    userid,
    (period_start + (n || ' days')::interval)::date as period_start,
    (period_end + (n || ' days')::interval)::date as period_end
  from period_intervals, generate_series(0, 30, 5) n
)
select userid, period_start, period_end,
       (select count(distinct post_date)
        from forumpost
        where forumpost.post_date between period_start and period_end
          -- qualify the outer userid; unqualified, it resolves to forumpost.userid
          and forumpost.userid = user_specific_intervals.userid) distinct_dates
from user_specific_intervals
order by userid, period_start;
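As a rough cross-check of the windowing idea, here is a Python sketch for a single user, assuming fixed, non-overlapping n-day periods that start at the user's first post. The function name `window_counts` and its `windows` parameter are hypothetical.

```python
from datetime import date, timedelta


def window_counts(days, n, windows):
    """days: one user's post dates. Returns (period_start, period_end, distinct_dates) rows."""
    distinct = set(days)
    first = min(distinct)
    rows = []
    for k in range(windows):
        start = first + timedelta(days=k * n)  # periods begin every n days
        end = start + timedelta(days=n - 1)    # and span n days inclusive
        rows.append((start, end, sum(1 for d in distinct if start <= d <= end)))
    return rows


# Hypothetical user who posts on 12 consecutive days starting 2013-01-01.
days = [date(2013, 1, 1) + timedelta(days=k) for k in range(12)]
rows = window_counts(days, 5, 3)
full_periods = sum(1 for _, _, count in rows if count == 5)
print(full_periods)
```

Counting only the periods whose distinct-date count equals n gives the "clock restarts" interpretation: 12 consecutive days contain two complete 5-day periods, not eight overlapping ones.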
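Which DBMS are you using? Postgres? Oracle?
Use a subquery for all posts with a date in the last 30 days, group by date and count.. check if 30?
Related, possible duplicate:
@FuzzyButton - If they post 30 times in one day and don't post on the other 29 days, the total would still be 30.
It depends on what "for 30 consecutive days 10 times" means. If you posted every day between 2013-01-01 and 2013-01-30, that's 30 consecutive days, 1 time. If you post again on 2013-01-31, is that 30 consecutive days 2 times?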