SQL: Dynamic week-over-week comparison by cohort
Tags: sql, sql-server, pivot, partitioning

Goal:
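Get the IDs that logged in during week 1, then find how many of those IDs also logged in during week 2.
Restart the same logic for week 2 to week 3.
Then week 3 to week 4, and so on. This exercise needs to be done every week.
The IDs need to be segmented by cohort, where the cohort is the month and year in which they subscribed.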
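The week-to-week logic described above could be sketched roughly as follows, in long (unpivoted) form. This is only a sketch under a few assumptions: it uses the `member` and `login` tables from the sample data in this post, it assumes `email` is unique in `member`, and it identifies each week by its Monday start date rather than by `DATEPART(YEAR, ...)` plus `ISO_WEEK`, because those two can disagree around New Year. The column names `week_start`, `active_members` and `retained_next_week` are made up for illustration.

```sql
WITH cohorts AS (
    -- Cohort label = subscription year-month, e.g. 2018-3
    SELECT m.email,
           CONCAT(YEAR(m.date_created), '-', MONTH(m.date_created)) AS Cohort
    FROM member AS m
),
weekly AS (
    -- One row per email per week with at least one login.
    -- Day 0 (1900-01-01) was a Monday, so this rounds down to the Monday
    -- of the login's week, independent of the DATEFIRST setting.
    SELECT DISTINCT
           l.email,
           DATEADD(DAY, DATEDIFF(DAY, 0, l.login_time) / 7 * 7, 0) AS week_start
    FROM login AS l
    WHERE l.login_time >= '20190101'
)
SELECT c.Cohort,
       w1.week_start,
       COUNT(*)        AS active_members,      -- logged in during this week
       COUNT(w2.email) AS retained_next_week   -- ...and again the week after
FROM weekly AS w1
JOIN cohorts AS c
  ON c.email = w1.email
LEFT JOIN weekly AS w2
  ON  w2.email = w1.email
  AND w2.week_start = DATEADD(DAY, 7, w1.week_start)
GROUP BY c.Cohort, w1.week_start
ORDER BY c.Cohort, w1.week_start;   -- note: Cohort sorts as text
```

Dividing `retained_next_week` by `active_members` would give the retention rate for each cohort and week. Because the comparison is always "this week versus the next one", the retained count can never exceed the active count of the week it is compared against.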
Story:
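The first table (member) contains emails and their creation dates. The second table (login) holds the login activity. First, I need to group the emails by creation date (month-year) to create the cohorts.
Then the login activity has to be compared week over week for each cohort. Is this possible?
Can this query be made dynamic so that it covers every week?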
Output:
The result should look like this:
+--------+--------+--------+--------+--------+
| Cohort | 2019-1 | 2019-2 | 2019-3 | 2019-4 | ...
+--------+--------+--------+--------+--------+
| 2018-3 |   7000 |   6800 |   7400 |   7100 | ...
| 2018-4 |   6800 |   6500 |   8400 |   8000 | ...
| 2018-5 |   9500 |   8000 |   6400 |   6200 | ...
| 2018-6 |   9100 |   8500 |   8000 |   7800 | ...
| 2018-7 |  10000 |   8000 |   7000 |   6800 | ...
+--------+--------+--------+--------+--------+
My attempt:
SELECT CONCAT(DATEPART(YEAR, m.date_created), '-', DATEPART(MONTH, m.date_created)) AS Cohort
      ,CONCAT(subquery.[YYYY], '-', subquery.[ISO]) AS YYYY_ISO
      ,m.email
FROM member AS m
INNER JOIN (SELECT DATEPART(YEAR, log.login_time) AS [YYYY]
                  ,DATEPART(ISO_WEEK, log.login_time) AS [ISO]
                  ,log.email
                  ,ROW_NUMBER()
                       OVER(PARTITION BY
                                DATEPART(YEAR, log.login_time),
                                DATEPART(ISO_WEEK, log.login_time),
                                log.email
                            ORDER BY log.login_time ASC) AS Log_Rank
            FROM login AS log
            WHERE CAST(log.login_time AS DATE) >= '2019-01-01'
           ) AS subquery ON m.email = subquery.email AND Log_Rank = 1
ORDER BY cohort
Sample data:
CREATE TABLE member
    ([email] varchar(50), [date_created] Datetime);

CREATE TABLE login
    ([email] varchar(50), [login_time] Datetime);

INSERT INTO member
VALUES
    ('player123@google.com', '2018-03-01 05:00:00'),
    ('player999@google.com', '2018-04-12 12:00:00'),
    ('player555@google.com', '2018-04-25 20:15:00');

INSERT INTO login
VALUES
    ('player123@google.com', '2019-01-07 05:30:00'),
    ('player123@google.com', '2019-01-08 08:30:00'),
    ('player123@google.com', '2019-01-15 06:30:00'),
    ('player999@google.com', '2019-01-08 11:30:00'),
    ('player999@google.com', '2019-01-10 07:30:00'),
    ('player555@google.com', '2019-01-08 04:30:00');
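Comments:

Where do values like 7000 come from? I'm a bit confused about that. Are we missing (a lot of) data?

They are randomly generated. Basically, if you compare week 1 with week 2, week 2 cannot exceed week 1. When week 2 is compared with week 3, week 3 cannot exceed week 2, and so on.

But 7000 is less than 8000? We have some very good sample data, but I don't know what your logic is.

@Larnu Yes, good point. The logic is as follows: if 1000 people (who subscribed in March 2018) logged in during week 1 and 700 of them logged in during week 2, that is 70% retention. I need to understand how to do this for the other cohorts (i.e. subscribers from April 2018, May 2018, etc.), week after week. The output is used as a visual representation for the final product. It looks like a pivot should be used, but I'm unsure.

@RogerSteinberg Hi, if I understand what you want, I think you have to use pivot, GROUP BY, a dynamic exec query and some aggregate functions such as COUNT.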
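To make the last comment's suggestion concrete, here is a minimal sketch of a dynamic pivot that produces one column per week. It is not a confirmed answer from this thread, just one way the pieces (PIVOT, COUNT, dynamic SQL) might fit together. Assumptions: the `member` and `login` tables from the sample data, weeks labelled by their Monday start date (`yyyy-mm-dd`) instead of the `2019-1` style labels in the desired output, and the pivoted value being the number of distinct members of the cohort who logged in that week. The names `@cols`, `@sql`, `week_label` and `src` are illustrative only.

```sql
DECLARE @cols nvarchar(max), @sql nvarchar(max);

-- Build the column list, e.g. [2019-01-07],[2019-01-14], from the weeks
-- that actually occur in the login table.
SELECT @cols = STUFF((
    SELECT ',' + QUOTENAME(CONVERT(char(10), w.week_start, 120))
    FROM (SELECT DISTINCT
                 DATEADD(DAY, DATEDIFF(DAY, 0, login_time) / 7 * 7, 0) AS week_start
          FROM login
          WHERE login_time >= '20190101') AS w
    ORDER BY w.week_start
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, '');

-- If there are no logins, @cols (and therefore @sql) is NULL and nothing runs.
SET @sql = N'
SELECT Cohort, ' + @cols + N'
FROM (
    -- DISTINCT keeps one row per cohort, week and email,
    -- so COUNT(email) counts members, not logins.
    SELECT DISTINCT
           CONCAT(YEAR(m.date_created), ''-'', MONTH(m.date_created)) AS Cohort,
           CONVERT(char(10),
                   DATEADD(DAY, DATEDIFF(DAY, 0, l.login_time) / 7 * 7, 0),
                   120) AS week_label,
           l.email
    FROM member AS m
    JOIN login AS l ON l.email = m.email
    WHERE l.login_time >= ''20190101''
) AS src
PIVOT (COUNT(email) FOR week_label IN (' + @cols + N')) AS p
ORDER BY Cohort;';

EXEC sys.sp_executesql @sql;
```

Because the column list is rebuilt from the data on every run, new weeks appear as new columns without editing the query. To show retained members instead of all active members, the inner `src` query could be swapped for one that keeps only emails that also logged in during the previous week; the pivot wrapper would stay the same.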