MySQL query returns duplicated values


The query below returns some duplicated and incorrect values:

SELECT c.contest_id, c.hacker_id, c.name,
SUM(s.total_submissions) as total_submissions,
SUM(s.total_accepted_submissions) as total_accepted_submissions,
SUM(v.total_views) as total_views,
SUM(v.total_unique_views) as total_unique_views
FROM concursos c
JOIN faculdades f ON f.contest_id = c.contest_id
JOIN desafios d ON d.college_id = f.college_id
LEFT JOIN view_stats v ON v.challenge_id = d.challenge_id
LEFT JOIN submission_stats s ON s.challenge_id = d.challenge_id
GROUP BY c.contest_id;
The output should look like this:

| contest_id | hacker_id | name   | total_submissions | total_accepted_submissions | total_views | total_unique_views |
+------------+-----------+--------+-------------------+----------------------------+-------------+--------------------+
|      66406 |     17973 | Rose   |               111 |                         39 |         156 |                 56 |
|      66556 |     79153 | Angela |                 0 |                          0 |          11 |                 10 |
|      94828 |     80275 | Frank  |               150 |                         38 |          41 |                 15 |

But the result is this:

| contest_id | hacker_id | name   | total_submissions | total_accepted_submissions | total_views | total_unique_views |
+------------+-----------+--------+-------------------+----------------------------+-------------+--------------------+
|      66406 |     17973 | Rose   |               222 |                         78 |         238 |                122 |
|      66556 |     79153 | Angela |              NULL |                       NULL |          11 |                 10 |
|      94828 |     80275 | Frank  |               150 |                         38 |          82 |                 30 |

Table schema: (shown as a screenshot in the original question)

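What happened

The problem is that view_stats and submission_stats each have multiple rows per challenge_id.

The joins in the query are evaluated before GROUP BY and SUM. Imagine the result set of the query without GROUP BY and SUM: every view_stats row for a challenge is paired with every submission_stats row for the same challenge, so each value is repeated before it is summed.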
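To check whether this is really the cause, one could count the rows per challenge_id in each stats table. The following is only a sketch that assumes the view_stats and submission_stats tables from the question; the aliases source_table and row_count are made up for illustration.

```sql
-- Assumes the view_stats and submission_stats tables from the question.
-- Lists every challenge_id that occurs on more than one row in a stats table.
SELECT 'view_stats' AS source_table, challenge_id, COUNT(*) AS row_count
FROM view_stats
GROUP BY challenge_id
HAVING COUNT(*) > 1
UNION ALL
SELECT 'submission_stats', challenge_id, COUNT(*)
FROM submission_stats
GROUP BY challenge_id
HAVING COUNT(*) > 1;
```

Any challenge_id listed for both tables gets its rows multiplied against each other when both tables are joined to desafios.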

A simplified example:

ids table:

id
--
 1


x table:

id|vx
------
 1|11
 1|22


y table:

id|vy
------
 1| 1

The result of

SELECT ids.id, x.vx, y.vy
FROM ids
LEFT JOIN x ON x.id = ids.id
LEFT JOIN y ON y.id = ids.id;
would be

| id  | vx  | vy  |
| --- | --- | --- |
| 1   | 11  | 1   |
| 1   | 22  | 1   |
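Note the duplicated 1 in the vy column, even though the original y table has only one row. This happens because table x has two rows for id = 1. Those rows are joined first, which also duplicates the row from the ids table. Then y is joined to these already duplicated rows, which duplicates the row from y as well. After grouping and summing, the result is: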

| id  | SUM(vy) |
| --- | ------- |
| 1   |       2 |
A dbfiddle with the simplified example is available to experiment with.
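The simplified example can also be reproduced locally with a script along these lines. It is a minimal sketch: the table and column names follow the example above, while the INT column types and the aliases sum_vx and sum_vy are assumptions.

```sql
-- Rebuild the simplified example (column types are assumed to be INT).
DROP TABLE IF EXISTS ids, x, y;
CREATE TABLE ids (id INT);
CREATE TABLE x (id INT, vx INT);
CREATE TABLE y (id INT, vy INT);

INSERT INTO ids (id) VALUES (1);
INSERT INTO x (id, vx) VALUES (1, 11), (1, 22);
INSERT INTO y (id, vy) VALUES (1, 1);

-- Row-level join: the single y row appears once per matching x row.
SELECT ids.id, x.vx, y.vy
FROM ids
LEFT JOIN x ON x.id = ids.id
LEFT JOIN y ON y.id = ids.id;

-- Aggregating after the joins sums the duplicated vy value.
SELECT ids.id, SUM(x.vx) AS sum_vx, SUM(y.vy) AS sum_vy
FROM ids
LEFT JOIN x ON x.id = ids.id
LEFT JOIN y ON y.id = ids.id
GROUP BY ids.id;
-- sum_vx = 33, sum_vy = 2 (the one y row is counted twice)
```

Adding a third row to x for id = 1 would raise sum_vy to 3, which makes the multiplication effect easy to see.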

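Solution

There are several ways to solve this. The most intuitive is to group and sum the view_stats rows and the submission_stats rows before joining them, so that each derived table contributes at most one row per challenge_id: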

SELECT c.contest_id, c.hacker_id, c.name,
   SUM(s.total_submissions) as total_submissions,
   SUM(s.total_accepted_submissions) as total_accepted_submissions,
   SUM(v.total_views) as total_views,
   SUM(v.total_unique_views) as total_unique_views
FROM concursos c
JOIN faculdades f ON f.contest_id = c.contest_id
JOIN desafios d ON d.college_id = f.college_id
LEFT JOIN (
   SELECT 
      challenge_id, 
      SUM(total_views) as total_views,
      SUM(total_unique_views) as total_unique_views
   FROM view_stats
   GROUP BY challenge_id
) v ON v.challenge_id = d.challenge_id
LEFT JOIN (
   SELECT 
      challenge_id,
      SUM(total_submissions) as total_submissions,
      SUM(total_accepted_submissions) as total_accepted_submissions
   FROM submission_stats
   GROUP BY challenge_id
) s ON s.challenge_id = d.challenge_id
GROUP BY c.contest_id
# to output only rows with non zero sums
HAVING
   IFNULL(SUM(s.total_submissions), 0) <> 0
   OR IFNULL(SUM(s.total_accepted_submissions), 0) <> 0
   OR IFNULL(SUM(v.total_views), 0) <> 0
   OR IFNULL(SUM(v.total_unique_views), 0) <> 0;
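
The same idea applied to the simplified example may make the effect easier to verify. This is a sketch under the same assumptions as that example (tables ids, x and y with INT columns); the derived-table aliases xs and ys and the column aliases sum_vx and sum_vy are hypothetical.

```sql
-- Rebuild the simplified example (column types are assumed to be INT).
DROP TABLE IF EXISTS ids, x, y;
CREATE TABLE ids (id INT);
CREATE TABLE x (id INT, vx INT);
CREATE TABLE y (id INT, vy INT);

INSERT INTO ids (id) VALUES (1);
INSERT INTO x (id, vx) VALUES (1, 11), (1, 22);
INSERT INTO y (id, vy) VALUES (1, 1);

-- Aggregate each table to one row per id before joining,
-- so neither join can multiply the rows of the other.
SELECT ids.id, xs.sum_vx, ys.sum_vy
FROM ids
LEFT JOIN (
   SELECT id, SUM(vx) AS sum_vx
   FROM x
   GROUP BY id
) xs ON xs.id = ids.id
LEFT JOIN (
   SELECT id, SUM(vy) AS sum_vy
   FROM y
   GROUP BY id
) ys ON ys.id = ids.id;
-- sum_vx = 33, sum_vy = 1
```

If zeros rather than NULL are wanted for groups without matching rows, as in the expected output of the question, wrapping the sums in COALESCE(..., 0) would be one option.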

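Comments:

Hi André, thanks for asking! Could you provide a minimal reproducible example on a fiddle or similar site? That usually gets an answer faster. Also, please do not put details of the question in images. Best practice is to share the details as text.

Please provide the data related to contest_id = 66406 and hacker_id = 17973. Your screenshot does not contain the complete data for it. One more thing: because of the one-to-many relationship with submission_stats, the view_stats data gets duplicated, and vice versa.

Thank you, Christian, I understand your reasoning. But the code above still does not work, and I cannot identify the error. This is the error it gives: ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM submission_stats GROUP BY challenge_id ) s ON s.challenge_id = d.challen' at line 22

@AndrAlexandreStella What does not work? Is there an error when executing the code, or do you get wrong results?

Yes, there was one comma too many on one line. I fixed the code in the answer.

There was an error when executing the code. Thanks for the fiddle! I verified it there, and after removing the comma it works.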