SQL return consecutive records

Tags: sql, linq, lambda, sql-server-2008-r2

A simple table:

ForumPost
--------------
ID (int PK)
UserID (int FK)
Date (datetime)

What I want to return is the number of times a particular user has posted at least once a day for n consecutive days.

For example:

User 15844 has posted at least 1 post a day for 30 consecutive days 10 times

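I have also tagged this question linq/lambda, since a solution in that form would be welcome too. I know I can solve this by iterating over all of a user's records, but that is slow.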

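There is a handy trick that uses ROW_NUMBER() to find consecutive entries. Imagine a set of dates, each with its row number (starting from 0). For consecutive entries, subtracting the row number from the value gives the same result, e.g.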

Date        RowNumber   date - row_number
20130401    0           20130401
20130402    1           20130401
20130403    2           20130401
20130404    3           20130401
20130406    4           20130402
20130407    5           20130402

You can then group by date - row_number to get the sets of consecutive days (i.e. the first 4 records and the last 2 records).
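
To see the trick outside of SQL, here is a minimal Python sketch of the same idea. The function name `consecutive_runs` and the sample data are made up for illustration; the dates are the six from the table above.

```python
from datetime import date, timedelta
from itertools import groupby


def consecutive_runs(dates):
    """Split dates into runs of consecutive calendar days."""
    # One entry per distinct day, sorted: the equivalent of SELECT DISTINCT ... ORDER BY.
    ordered = sorted(set(dates))
    # Key = date minus row number. It stays constant inside a run of
    # consecutive days and jumps forward whenever there is a gap.
    keyed = [(d - timedelta(days=i), d) for i, d in enumerate(ordered)]
    return [[d for _, d in grp] for _, grp in groupby(keyed, key=lambda kd: kd[0])]


sample = [date(2013, 4, n) for n in (1, 2, 3, 4, 6, 7)]
runs = consecutive_runs(sample)
print([len(r) for r in runs])  # [4, 2]
```

Because the dates are distinct and sorted, the key never decreases, so equal keys are always adjacent and a single pass with `groupby` is enough.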

To apply this to your example, you could use:

WITH Posts AS
(   SELECT  FirstPost = DATEADD(DAY, 1 - ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [Date]), [Date]),
            UserID,
            Date
    FROM    (   SELECT  DISTINCT UserID, [Date] = CAST(Date AS [Date])
                FROM    ForumPost
            ) fp
), Posts2 AS
(   SELECT  FirstPost, 
            UserID, 
            Days = COUNT(*), 
            LastDate = MAX(Date)
    FROM    Posts
    GROUP BY FirstPost, UserID
)
SELECT  UserID, ConsecutiveDates = MAX(Days)
FROM    Posts2
GROUP BY UserID;

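Edit

I don't think the query above quite answered the question. This one gives the number of times a user has posted on n or more consecutive days: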

WITH Posts AS
(   SELECT  FirstPost = DATEADD(DAY, 1 - ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [Date]), [Date]),
            UserID,
            Date
    FROM    (   SELECT  DISTINCT UserID, [Date] = CAST(Date AS [Date])
                FROM    ForumPost
            ) fp
), Posts2 AS
(   SELECT  FirstPost, 
            UserID, 
            Days = COUNT(*), 
            FirstDate = MIN(Date), 
            LastDate = MAX(Date)
    FROM    Posts
    GROUP BY FirstPost, UserID
)
SELECT  UserID, [Times Over N Days] = COUNT(*)
FROM    Posts2
WHERE   Days >= 30
GROUP BY UserID;
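
For comparison, the same counting logic can be sketched in Python. This is only an illustration under assumptions: `streaks_of_at_least` is a hypothetical helper, and `posts` is an in-memory list of (user id, datetime) pairs standing in for the ForumPost rows.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def streaks_of_at_least(posts, n):
    """Return {user_id: number of runs of >= n consecutive posting days}."""
    days_by_user = defaultdict(set)
    for user_id, posted_at in posts:
        # Drop the time part and de-duplicate, like CAST(... AS date) + DISTINCT.
        days_by_user[user_id].add(posted_at.date())

    result = {}
    for user_id, days in days_by_user.items():
        run_lengths = defaultdict(int)
        for i, day in enumerate(sorted(days)):
            # Group by (date - row number); each group is one run.
            run_lengths[day - timedelta(days=i)] += 1
        qualifying = sum(1 for length in run_lengths.values() if length >= n)
        if qualifying:  # users with no qualifying run are left out, as in the SQL
            result[user_id] = qualifying
    return result


# Hypothetical sample: user 1 posts Jan 1-5 (twice on Jan 1) and Jan 10-12;
# user 2 posts on Jan 1 and Jan 3 only.
posts = [(1, datetime(2013, 1, d, 9, 0)) for d in (1, 1, 2, 3, 4, 5, 10, 11, 12)]
posts += [(2, datetime(2013, 1, d, 9, 0)) for d in (1, 3)]
print(streaks_of_at_least(posts, 3))  # {1: 2}
print(streaks_of_at_least(posts, 5))  # {1: 1}
```

Note that a single long run counts once here, however long it is: a 60-day run is one run of at least 30 days, not two.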

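I think your particular application makes this quite simple. If there are 'n' distinct dates in an 'n'-day interval, those 'n' distinct dates must be consecutive.

Scroll to the bottom for a general solution that requires only common table expressions and a change to PostgreSQL. (Kidding. I implemented it in PostgreSQL because I was short on time.)

Now, let's look at the output of this query. For simplicity, I'm considering 5-day intervals rather than 30-day intervals.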

select userid, count(distinct post_date) distinct_dates
from forumpost
where post_date between '2013-01-15' and '2013-01-19'
group by userid;

USERID  DISTINCT_DATES  
1       5
2       5
3       1

For users that fit the criteria, the number of distinct dates in the 5-day interval has to be 5, right? So we just need to add that logic in a HAVING clause.

select userid, count(distinct post_date) distinct_dates
from forumpost
where post_date between '2013-01-15' and '2013-01-19'
group by userid
having count(distinct post_date) = 5;

USERID  DISTINCT_DATES  
1       5
2       5
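
The HAVING check can also be sketched in Python. The helper name `users_posting_every_day` and the sample rows are assumptions made up for this illustration; they are not the data behind the output above.

```python
from datetime import date, timedelta


def users_posting_every_day(posts, start, end):
    """Users with at least one post on every day in [start, end].

    posts is an iterable of (user_id, date) pairs.
    """
    needed = (end - start).days + 1  # number of days in the interval
    seen = {}
    for user_id, post_date in posts:
        if start <= post_date <= end:  # the BETWEEN filter
            seen.setdefault(user_id, set()).add(post_date)
    # n distinct dates inside an n-day interval can only be n consecutive days.
    return sorted(u for u, days in seen.items() if len(days) == needed)


start, end = date(2013, 1, 15), date(2013, 1, 19)
five_days = [start + timedelta(days=k) for k in range(5)]
posts = [(1, d) for d in five_days]
posts += [(2, d) for d in five_days] + [(2, start)]  # a second post on one day
posts += [(3, date(2013, 1, 16))]
print(users_posting_every_day(posts, start, end))  # [1, 2]
```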

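A more general solution

It doesn't make sense to say that if you post every day from 2013-01-01 to 2013-01-31, you've posted on 30 consecutive days 2 times. Instead, I'd expect the clock to start over on 2013-01-31. My apologies for implementing this in PostgreSQL; I'll try to implement it in T-SQL later.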

with first_posts as (
  select userid, min(post_date) first_post_date
  from forumpost
  group by userid
), 
period_intervals as (
  select userid, first_post_date period_start, 
         (first_post_date + interval '4' day)::date period_end
  from first_posts
), user_specific_intervals as (
  select 
    userid, 
    (period_start + (n || ' days')::interval)::date as period_start, 
    (period_end + (n || ' days')::interval)::date as period_end 
  from period_intervals, generate_series(0, 30, 5) n
)
select userid, period_start, period_end, 
       (select count(distinct post_date) 
        from forumpost
        where forumpost.post_date between period_start and period_end
          and forumpost.userid = user_specific_intervals.userid) distinct_dates
from user_specific_intervals
order by userid, period_start;
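
The "clock restarts" interpretation can be sketched in Python for a single user. This is a rough illustration, not a port of the query: `full_windows` is a hypothetical helper, and unlike the fixed `generate_series(0, 30, 5)` range it keeps stepping until the user's last post.

```python
from datetime import date, timedelta


def full_windows(post_dates, n):
    """Count back-to-back n-day windows, anchored at the first post,
    in which the user posted on every single day."""
    days = set(post_dates)
    if not days:
        return 0
    start, last = min(days), max(days)
    count = 0
    while start <= last:
        window = {start + timedelta(days=k) for k in range(n)}
        if window <= days:  # every day of the window has a post
            count += 1
        start += timedelta(days=n)  # the clock restarts after each window
    return count


january = [date(2013, 1, d) for d in range(1, 32)]  # a post every day in January
print(full_windows(january, 30))  # 1
print(full_windows(january, 5))   # 6
```

Like the query, this sketch uses a fixed grid of windows anchored at the first post, so a run that begins part-way through a window is only counted from the next window boundary.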

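Which DBMS are you using? Postgres? Oracle?

Use a subquery over all posts in the date range going back 30 days, group by date and count, then check whether the count is 30?

Related, possibly a duplicate:

@FuzzyButton - if they posted 30 times in one day and nothing on the other 29 days, the total would still be 30.

It depends on what "30 consecutive days 10 times" means. If you post every day between 2013-01-01 and 2013-01-30, that's 30 consecutive days, 1 time. If you post again on 2013-01-31, is that 30 consecutive days 2 times?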