MySQL: why is the comment count wrong?
I need to count the comments and votes for each project, but the comment count comes out wrong:
SELECT projects . * , COUNT( votes.project_id ) AS votes, COUNT( comments.user_id) AS comments
FROM `projects`
LEFT JOIN `votes` ON `projects`.`id` = `votes`.`project_id`
LEFT JOIN `comments` ON `projects`.`id` = `comments`.`project_id`
WHERE `votes`.`created_at` > '2014-05-31 20:21:43'
GROUP BY `projects`.`id`
ORDER BY `votes` DESC
Output:
You need to count distinct values, so something like this:
SELECT projects . * , COUNT( DISTINCT votes.user_id ) AS votes, COUNT( DISTINCT comments.user_id) AS comments
FROM `projects`
LEFT JOIN `votes` ON `projects`.`id` = `votes`.`project_id`
LEFT JOIN `comments` ON `projects`.`id` = `comments`.`project_id`
WHERE `votes`.`created_at` > '2014-05-31 20:21:43'
GROUP BY `projects`.`id`
ORDER BY `votes` DESC
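One caveat with counting DISTINCT user_id: it counts users, not rows, so a user who comments twice on the same project is counted once. A possible variant is to count a column that is unique per row instead. The sketch below assumes that votes and comments each have a primary key column named id; that column is not shown in the question, so treat it as a hypothetical name.

```sql
-- Assumption: votes.id and comments.id are per-row primary keys (not shown in the question).
SELECT projects.*
     , COUNT(DISTINCT votes.id)    AS votes     -- one per vote row, however often the join repeats it
     , COUNT(DISTINCT comments.id) AS comments  -- one per comment row, even if one user wrote several
  FROM projects
  LEFT JOIN votes    ON projects.id = votes.project_id
  LEFT JOIN comments ON projects.id = comments.project_id
 WHERE votes.created_at > '2014-05-31 20:21:43'
 GROUP BY projects.id
 ORDER BY votes DESC;
```

The join still produces the multiplied rows; DISTINCT only removes the duplicates afterwards, so this fixes the numbers but not the amount of work.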
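The counts are "wrong" because the COUNT() aggregate counts rows in the joined result set, not rows in the individual tables. If, for example, project_id = 1 has four matching rows in votes and two rows in comments, both COUNT aggregates return 8. In effect, every comments row is paired with every votes row for the same project.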
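The row multiplication is easy to reproduce in a scratch database. The following is a minimal sketch: the table names come from the question, but the column lists and sample rows are assumptions made up for illustration.

```sql
-- Scratch schema; only the columns used by the queries are modelled (assumed, not from the question).
CREATE TABLE projects (id INT PRIMARY KEY, name VARCHAR(50));
CREATE TABLE votes    (id INT PRIMARY KEY, project_id INT, user_id INT, created_at DATETIME);
CREATE TABLE comments (id INT PRIMARY KEY, project_id INT, user_id INT);

INSERT INTO projects VALUES (1, 'demo');
-- Four votes for project 1
INSERT INTO votes VALUES
  (1, 1, 10, '2014-06-01 00:00:00'),
  (2, 1, 11, '2014-06-01 00:00:00'),
  (3, 1, 12, '2014-06-01 00:00:00'),
  (4, 1, 13, '2014-06-01 00:00:00');
-- Two comments for project 1
INSERT INTO comments VALUES (1, 1, 10), (2, 1, 11);

-- Without aggregation the join returns 4 x 2 = 8 rows:
-- every vote row is paired with every comment row of the same project.
SELECT p.id, v.id AS vote_id, c.id AS comment_id
  FROM projects p
  LEFT JOIN votes v    ON v.project_id = p.id
  LEFT JOIN comments c ON c.project_id = p.id;

-- COUNT() counts those joined rows, so both columns report 8.
SELECT p.id
     , COUNT(v.project_id) AS votes
     , COUNT(c.user_id)    AS comments
  FROM projects p
  LEFT JOIN votes v    ON v.project_id = p.id
  LEFT JOIN comments c ON c.project_id = p.id
 GROUP BY p.id;
-- votes = 8, comments = 8
```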
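There are several ways to fix this. One is to use correlated subqueries in the SELECT list, although this can be expensive (in terms of performance) for large sets: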
SELECT p.*
     , ( SELECT COUNT(1)
           FROM votes v
          WHERE v.project_id = p.id
            AND v.created_at > '2014-05-31 20:21:43'
       ) AS votes
     , ( SELECT COUNT(1)
           FROM comments c
          WHERE c.project_id = p.id
       ) AS comment_cnt
  FROM projects p
HAVING votes > 0
 ORDER BY votes DESC
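(The HAVING clause is included to emulate the original query; in the original query, the predicate on votes.created_at in the WHERE clause negates the "outerness" of the LEFT JOIN to the votes table.)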
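If projects without any qualifying votes should still be listed with a count of zero, one option is to move the date predicate from the WHERE clause into the ON clause of the join. This is a sketch of the difference, not something the original question asks for; it uses only the votes join so that the row multiplication from the comments join does not get in the way.

```sql
-- Predicate in WHERE: rows where v.created_at is NULL (projects with no votes)
-- fail the comparison and are filtered out, so the LEFT JOIN behaves like an inner join.
SELECT p.id
     , COUNT(v.project_id) AS votes
  FROM projects p
  LEFT JOIN votes v
    ON v.project_id = p.id
 WHERE v.created_at > '2014-05-31 20:21:43'
 GROUP BY p.id;

-- Predicate in ON: the filter applies while matching, so a project with no
-- recent votes keeps its NULL-extended row and is reported with votes = 0.
SELECT p.id
     , COUNT(v.project_id) AS votes
  FROM projects p
  LEFT JOIN votes v
    ON v.project_id = p.id
   AND v.created_at > '2014-05-31 20:21:43'
 GROUP BY p.id;
```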
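Another approach is to get the counts from each table separately (two separate queries, each grouped by project_id and referenced as inline views), and then combine those counts with join operations. For example: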
SELECT p.*
     , w.votes
     , IFNULL(d.comment_cnt,0) AS comment_cnt
  FROM projects p
  JOIN ( SELECT v.project_id
              , COUNT(1) AS votes
           FROM votes v
          WHERE v.created_at > '2014-05-31 20:21:43'
          GROUP BY v.project_id
       ) w
    ON w.project_id = p.id
  LEFT
  JOIN ( SELECT c.project_id
              , COUNT(1) AS comment_cnt
           FROM comments c
          GROUP BY c.project_id
       ) d
    ON d.project_id = p.id
 ORDER BY w.votes DESC
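(Because the specification only returns rows with a "votes" count greater than zero, we can use an inner join to exclude rows that have no votes. For the comment count, we use an outer join and simply replace any NULL with zero.)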
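There are other approaches as well.
Performance will depend on the number of rows, the cardinality of the referenced columns, the available indexes, the execution plan the optimizer chooses, and so on.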
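As a starting point for checking performance, it may help to index the columns used for joining and filtering and then look at the plan with EXPLAIN. The index names below are hypothetical, and whether the optimizer actually uses these indexes depends on the data and the MySQL version.

```sql
-- Hypothetical index names; the columns are the ones the queries join and filter on.
CREATE INDEX votes_project_created_ix ON votes (project_id, created_at);
CREATE INDEX comments_project_ix      ON comments (project_id);

-- Inspect the plan for the per-project vote count used as an inline view.
EXPLAIN
SELECT v.project_id
     , COUNT(1) AS votes
  FROM votes v
 WHERE v.created_at > '2014-05-31 20:21:43'
 GROUP BY v.project_id;
```

Comparing the EXPLAIN output of the different query forms on realistic data is more reliable than reasoning about them in the abstract.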