Having trouble getting MySQL to find users who were active in one year but not in the next
I have a MySQL table:
CREATE TABLE IF NOT EXISTS users_data (
userid int(11) NOT NULL,
computer varchar(30) DEFAULT NULL,
logondate date NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
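It is a fairly large table: about 400 distinct users, 20 computers, and roughly 20,000 entries covering 5 years of users logging on to computers.
I want to build a summary table that lists, for each computer and each year, the number of unique users, how many of those users were new (that is, they have no logon to any computer before that year), and how many were terminated (that is, they have no logon to any computer after that year):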
CREATE TABLE IF NOT EXISTS summary_computer_use (
computer varchar(30) DEFAULT NULL,
year_used int(11) NOT NULL, -- calendar year such as 2010; YEAR(logondate) returns a number, not a date
number_of_users int(11) NOT NULL,
number_of_new_users int(11) NOT NULL,
number_of_terminated_users int(11) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
INSERT into summary_computer_use (computer, year_used)
select distinct computer, year(logondate) from users_data;
I can get the unique users per computer per year:
UPDATE summary_computer_use as a
inner join (
select computer, year(logondate) as year_used,
count(distinct userid) as number_of_users
from users_data
group by computer, year(logondate)
) as b on a.computer = b.computer and
a.year_used = b.year_used
set a.number_of_users = b.number_of_users;
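But I am stumped on how to write a select statement that finds the number of users who used a computer for the first time in a given year (no logon date earlier than that year), or who never logged on again.
Any suggestions?
This may be what you are after: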
select y, count(userid) as newusers from
(
select userid, min(year(logondate)) as y from users_data group by userid
) tmp
group by y;
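The same pattern can be mirrored for users who never log on again. This is only a sketch, assuming "terminated" means the user's last logon on any computer falls in that year; the alias terminated_users is made up for illustration. Keep in mind that the most recent year in the data will count every still-active user as terminated, so that row is not meaningful yet.

```sql
-- Sketch: count users by the year of their last logon on any computer.
-- terminated_users is a hypothetical alias, not a name from the question.
SELECT y, COUNT(userid) AS terminated_users
FROM (
    -- One row per user: the year of that user's latest logon.
    SELECT userid, YEAR(MAX(logondate)) AS y
    FROM users_data
    GROUP BY userid
) tmp
GROUP BY y;
```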
I think this produces the summary you want:
SELECT computers.computer,
timespan.yyyy AS "year_used",
COALESCE(allusers.num, 0) AS "number_of_users",
COALESCE(newusers.num, 0) AS "number_of_new_users",
COALESCE(terminations.num, 0) AS "number_of_terminated_users"
FROM (SELECT DISTINCT computer
FROM users_data) computers
JOIN (SELECT (2000+i) AS yyyy
FROM integers
WHERE i BETWEEN 0 AND 10) timespan
LEFT JOIN ( SELECT YEAR(logondate) AS logonyear,
computer,
COUNT(DISTINCT userid) AS "num"
FROM users_data
GROUP BY 1, 2) allusers
ON timespan.yyyy = allusers.logonyear AND computers.computer = allusers.computer
LEFT JOIN ( SELECT last_logon AS logonyear,
computer,
COUNT(DISTINCT userid) AS "num"
FROM ( SELECT computer,
userid,
YEAR(MAX(logondate)) AS "last_logon"
FROM users_data
GROUP BY 1, 2) last_user_logons
GROUP BY 1, 2) terminations
ON timespan.yyyy = terminations.logonyear AND computers.computer = terminations.computer
LEFT JOIN ( SELECT first_logon AS logonyear,
computer,
COUNT(DISTINCT userid) AS "num"
FROM ( SELECT computer,
userid,
YEAR(MIN(logondate)) AS "first_logon"
FROM users_data
GROUP BY 1, 2) first_user_logons
GROUP BY 1, 2) newusers
ON timespan.yyyy = newusers.logonyear AND computers.computer = newusers.computer;
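These subqueries represent:
- computers: the distinct set of computers
- timespan: the years we are interested in
  - Note: this uses the integers table referenced in the query (one integer per row in column i), which is not defined here
  - Note: the last year (2011 at the time of writing) is not included, because we cannot "close the books" on that year's terminations until the year is complete
- allusers: the number of distinct users per computer per year
- newusers: the number of new users per computer per year (built on the first logon record of each user on each computer)
- terminations: the number of terminated users per computer per year (built on the last logon record of each user on each computer)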
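The timespan subquery reads from an integers table that the answer does not define. One possible way to create it is sketched below; the column type and the seeded range are assumptions chosen only so that (2000 + i) covers 2000 through 2010, matching the BETWEEN 0 AND 10 filter in the query.

```sql
-- Hypothetical helper table: the query only requires a column named i.
CREATE TABLE IF NOT EXISTS integers (
    i INT NOT NULL PRIMARY KEY
);

-- Seed 0..10; INSERT IGNORE skips rows that already exist,
-- so the script can be re-run safely.
INSERT IGNORE INTO integers (i)
VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9), (10);
```

Adjust the seeded values (or the 2000 offset in the query) if the logon data covers a different range of years.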
MIN(YEAR(logondate)) should be YEAR(MIN(logondate)), to take advantage of the presumed index on logondate.
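Applied to the earlier new-user query, that suggestion might look like the sketch below. The table definition in the question declares no index, so the composite index here (and its name idx_user_logon) is an assumption; whether the optimizer actually uses it is best checked with EXPLAIN. Both forms return the same years, since the earliest date always falls in the earliest year.

```sql
-- Assumed index; idx_user_logon is a hypothetical name.
CREATE INDEX idx_user_logon ON users_data (userid, logondate);

-- MIN() is applied to the raw column first, so the index can supply
-- each user's earliest logondate; YEAR() then runs once per user.
SELECT y, COUNT(userid) AS newusers
FROM (
    SELECT userid, YEAR(MIN(logondate)) AS y
    FROM users_data
    GROUP BY userid
) tmp
GROUP BY y;
```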