MySQL query takes a very long time (EXPLAIN included)

Goal of the query:

Show the racial breakdown by district.

Query:

SELECT school_data_schools_outer.district_id, 
       school_data_race_ethnicity_raw_outer.year,  
       school_data_race_ethnicity_raw_outer.race,
       ROUND( 
           SUM( school_data_race_ethnicity_raw_outer.count) /
                (SELECT SUM(count)
                   FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_inner
             INNER JOIN school_data_schools as school_data_schools_inner 
                  USING (school_id)
                  WHERE school_data_schools_outer.district_id = school_data_schools_inner.district_id 
                    AND school_data_race_ethnicity_raw_outer.year = school_data_race_ethnicity_raw_inner.year) * 100, 2)
      FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_outer
INNER JOIN school_data_schools as school_data_schools_outer USING (school_id)
  GROUP BY school_data_schools_outer.district_id, 
           school_data_race_ethnicity_raw_outer.year, 
           school_data_race_ethnicity_raw_outer.race

mysql> explain SELECT school_data_schools_outer.district_id, school_data_race_ethnicity_raw_outer.year, school_data_race_ethnicity_raw_outer.race,ROUND(SUM(school_data_race_ethnicity_raw_outer.count)/( SELECT SUM(count) FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_inner INNER JOIN school_data_schools as school_data_schools_inner USING (school_id) WHERE school_data_schools_outer.district_id = school_data_schools_inner.district_id and school_data_race_ethnicity_raw_outer.year = school_data_race_ethnicity_raw_inner.year ) * 100,2) FROM school_data_race_ethnicity_raw as school_data_race_ethnicity_raw_outer INNER JOIN school_data_schools as school_data_schools_outer USING (school_id) GROUP BY school_data_schools_outer.district_id, school_data_race_ethnicity_raw_outer.year, school_data_race_ethnicity_raw_outer.race;
+----+--------------------+--------------------------------------+--------+----------------------------+---------+---------+----------------------------------------------------------------------+-------+---------------------------------+
| id | select_type        | table                                | type   | possible_keys              | key     | key_len | ref                                                                  | rows  | Extra                           |
+----+--------------------+--------------------------------------+--------+----------------------------+---------+---------+----------------------------------------------------------------------+-------+---------------------------------+
|  1 | PRIMARY            | school_data_race_ethnicity_raw_outer | ALL    | school_id,school_id_2      | NULL    | NULL    | NULL                                                                 | 84012 | Using temporary; Using filesort |
|  1 | PRIMARY            | school_data_schools_outer            | eq_ref | PRIMARY                    | PRIMARY | 257     | rocdocs_main_drupal_7.school_data_race_ethnicity_raw_outer.school_id |     1 |                                 |
|  2 | DEPENDENT SUBQUERY | school_data_race_ethnicity_raw_inner | ref    | school_id,year,school_id_2 | year    | 4       | func                                                                 |  8402 |                                 |
|  2 | DEPENDENT SUBQUERY | school_data_schools_inner            | eq_ref | PRIMARY                    | PRIMARY | 257     | rocdocs_main_drupal_7.school_data_race_ethnicity_raw_inner.school_id |     1 | Using where                     |
+----+--------------------+--------------------------------------+--------+----------------------------+---------+---------+----------------------------------------------------------------------+-------+---------------------------------+
4 rows in set (0.00 sec)

mysql>

mysql> describe school_data_race_ethnicity_raw;
+-----------+--------------+------+-----+---------+----------------+
| Field     | Type         | Null | Key | Default | Extra          |
+-----------+--------------+------+-----+---------+----------------+
| id        | int(11)      | NO   | PRI | NULL    | auto_increment |
| school_id | varchar(255) | NO   | MUL | NULL    |                |
| year      | int(11)      | NO   | MUL | NULL    |                |
| race      | varchar(255) | NO   |     | NULL    |                |
| count     | int(11)      | NO   |     | NULL    |                |
+-----------+--------------+------+-----+---------+----------------+
5 rows in set (0.00 sec)

mysql> describe school_data_schools;
+-------------+----------------+------+-----+---------+-------+
| Field       | Type           | Null | Key | Default | Extra |
+-------------+----------------+------+-----+---------+-------+
| school_id   | varchar(255)   | NO   | PRI | NULL    |       |
| grade_level | varchar(255)   | NO   |     | NULL    |       |
| district_id | varchar(255)   | NO   |     | NULL    |       |
| school_name | varchar(255)   | NO   |     | NULL    |       |
| address     | varchar(255)   | NO   |     | NULL    |       |
| city        | varchar(255)   | NO   |     | NULL    |       |
| lat         | decimal(20,10) | NO   |     | NULL    |       |
| lon         | decimal(20,10) | NO   |     | NULL    |       |
+-------------+----------------+------+-----+---------+-------+
8 rows in set (0.00 sec)
Note: I have also tried:

select sds.school_id, 
  detail.year, 
  detail.race,
  ROUND((detail.count / summary.total) * 100 ,2) as percent 
FROM school_data_race_ethnicity_raw as detail
inner join school_data_schools as sds USING (school_id)
inner join (
  select sds2.district_id, year, sum(count) as total
  from school_data_race_ethnicity_raw
  inner join school_data_schools as sds2 USING (school_id)
  group by sds2.district_id, year
  ) as summary on summary.district_id = sds.district_id 
    and summary.year = detail.year

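This should be all you need to aggregate your data into a race count by district. I'm not sure why you are doing so much math in your original query, since it isn't needed to reach your goal and it forces some crazy subqueries.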

SELECT SUM(students.count) AS studentCount, schools.district_id, students.race
FROM school_data_schools schools,
     school_data_race_ethnicity_raw students
WHERE schools.school_id = students.school_id
GROUP BY schools.district_id, students.race
You will probably also want an index on school_data_race_ethnicity_raw.school_id that is not part of a multi-column key.
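As a minimal sketch of that suggestion (the index name idx_raw_school_id is made up for illustration): the EXPLAIN output in the question already lists keys named school_id and school_id_2 on this table, so it is worth checking which columns they cover before adding another one.

```sql
-- List the indexes that already exist on the raw table and the columns in each.
SHOW INDEX FROM school_data_race_ethnicity_raw;

-- If none of them is a single-column index on school_id, add one.
-- idx_raw_school_id is a hypothetical name.
CREATE INDEX idx_raw_school_id
    ON school_data_race_ethnicity_raw (school_id);
```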

Edit: I didn't realize the OP was looking for a percentage breakdown rather than just totals.

SELECT ((studentCount / districtTotal) * 100) AS percentage, district_id, race
FROM (
  SELECT SUM(students.count) AS studentCount, schools.district_id, students.race,
    -- total student count for the same district (all races)
    (SELECT SUM(inStudents.count)
     FROM school_data_schools inSchools,
          school_data_race_ethnicity_raw inStudents
     WHERE inSchools.school_id = inStudents.school_id
       AND inSchools.district_id = schools.district_id
     GROUP BY inSchools.district_id) AS districtTotal
  FROM school_data_schools schools,
       school_data_race_ethnicity_raw students
  WHERE schools.school_id = students.school_id
  GROUP BY schools.district_id, students.race
) table1

This should run quite fast, but you still need to make sure there is an index on school_data_race_ethnicity_raw.school_id that is not part of a multi-column index. You can see it in action; my test case is very small, but it does seem to check out.
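The question's own query also groups by year, which the query above leaves out. One possible way to get the percentage per district, year and race without any correlated subquery is to join two grouped derived tables. This is only a sketch under that assumption, not something tested against the real data, and the aliases by_race, by_district, race_total and district_total are made up here.

```sql
SELECT by_race.district_id,
       by_race.year,
       by_race.race,
       ROUND(by_race.race_total / by_district.district_total * 100, 2) AS percent
FROM (
    -- one row per (district, year, race)
    SELECT s.district_id, r.year, r.race, SUM(r.count) AS race_total
    FROM school_data_race_ethnicity_raw AS r
    INNER JOIN school_data_schools AS s USING (school_id)
    GROUP BY s.district_id, r.year, r.race
) AS by_race
INNER JOIN (
    -- one row per (district, year): the denominator
    SELECT s.district_id, r.year, SUM(r.count) AS district_total
    FROM school_data_race_ethnicity_raw AS r
    INNER JOIN school_data_schools AS s USING (school_id)
    GROUP BY s.district_id, r.year
) AS by_district
   ON by_district.district_id = by_race.district_id
  AND by_district.year = by_race.year;
```

Each derived table scans the raw data once, so the totals are computed once per group instead of once per output row. How well the final join performs depends on the MySQL version, since older releases do not index derived tables.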

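It's slow because:

1. You aren't using an index on school_data_race_ethnicity_raw_outer, so it is scanning every one of the ~84,000 rows.
2. You're using a correlated subquery, which means your complex calculation has to run once per row, i.e. 84,000 times.

The best approach is not to use a correlated subquery at all. If you do keep it, then to make it run fast you need indexes that let the entire inner query, and the other parts that go through their own index, run using only the index. For a great tutorial on the subject of indexing, check out […]. It taught me a lot!

Right now your inner query only uses the year index on school_data_race_ethnicity_raw, so it has to find everything else it needs by reading ~8,000 rows for each of the 84,000 calculations. An index will speed this up a lot. For example, create a composite index on school_data_race_ethnicity_raw and you should find it helps: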

-- district_id is a column of school_data_schools, not of this table, so it cannot be part of this index
CREATE INDEX inner_composite ON school_data_race_ethnicity_raw (year, school_id, count);
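This allows the fields used in the WHERE to be read from the index, followed by the join field, followed by the fields needed for the SELECT. You should see the index show up in the "key" column of the EXPLAIN output. Also, if you get it right, you will see "Using index" in the rightmost column (Extra), which shows that no table access took place; that can be orders of magnitude faster.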

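You can also try the quick-and-dirty approach: add a load of indexes on the columns the query mentions and see which one gets picked up in the key column. If one does, read your query to see which other columns of that table are in use, add a new index with those columns appended on the right, and see whether that works better. Remember to drop the unused indexes once you have found what works.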
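That trial-and-error loop might look like the following sketch. The values 2010 and 'D1' are placeholders rather than real data, and idx_experiment is a hypothetical name standing in for whichever experimental index turned out not to be used.

```sql
-- Run the inner aggregate on its own and read the "key" and "Extra" columns.
EXPLAIN
SELECT SUM(r.count)
FROM school_data_race_ethnicity_raw AS r
INNER JOIN school_data_schools AS s USING (school_id)
WHERE r.year = 2010            -- placeholder year
  AND s.district_id = 'D1';    -- placeholder district

-- Once a useful index is being picked, remove an experimental one that
-- EXPLAIN never chose (only if nothing else relies on it).
DROP INDEX idx_experiment ON school_data_race_ethnicity_raw;
```

Checking the inner query in isolation keeps the plan short, which makes it easier to see whether a new index changed anything.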

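MySQL doesn't let you directly index the sum of a column, which would be the fastest approach, so unless you want to move to another database, this will always be somewhat slow.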

Define "long". How many rows are in each table?
As you can see, it isn't really using any keys; it uses a temporary table, a filesort and where. How can I get it under a minute?
I'm looking for a percentage, not just a total.
Sorry, that wasn't part of your stated goal. See my edit.