Optimizing a SELECT statement in MySQL

I am looking for a way to make my SELECT query faster than it is now, because I feel it should be possible.

Here is the query:

SELECT r.id_customer, ROUND(AVG(tp.percentile_weighted), 2) AS percentile
FROM tag_rating AS r USE INDEX (value_date_add)
JOIN tag_product AS tp ON (tp.id_pair = r.id_pair)
WHERE 
r.value = 1 AND
r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer
And here is the EXPLAIN SELECT output:

+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
| id | select_type | table | type   | possible_keys  | key            | key_len | ref           | rows   | Extra                                                               |
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
| 1  | SIMPLE      | r     | ref    | value_date_add | value_date_add | 1       | const         | 449502 | Using index condition; Using where; Using temporary; Using filesort |
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
| 1  | SIMPLE      | tp    | eq_ref | PRIMARY        | PRIMARY        | 4       | dev.r.id_pair | 1      |                                                                     |
+----+-------------+-------+--------+----------------+----------------+---------+---------------+--------+---------------------------------------------------------------------+
And here are the table definitions:

CREATE TABLE `tag_product` (
  `id_pair` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `id_product` int(10) unsigned NOT NULL,
  `id_user_tag` int(10) unsigned NOT NULL,
  `status` tinyint(3) NOT NULL,
  `percentile` decimal(8,4) unsigned NOT NULL,
  `percentile_weighted` decimal(8,4) unsigned NOT NULL,
  `elo` int(10) unsigned NOT NULL,
  `date_add` datetime NOT NULL,
  `date_upd` datetime NOT NULL,
  PRIMARY KEY (`id_pair`),
  UNIQUE KEY `id_product_id_user_tag` (`id_product`,`id_user_tag`),
  KEY `status` (`status`),
  KEY `id_user_tag` (`id_user_tag`),
  CONSTRAINT `tag_product_ibfk_5` FOREIGN KEY (`id_user_tag`) REFERENCES `user_tag` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
CREATE TABLE `tag_rating` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `id_customer` int(10) unsigned NOT NULL,
  `id_pair` int(10) unsigned NOT NULL,
  `id_duel` int(10) unsigned NOT NULL,
  `value` tinyint(4) NOT NULL,
  `date_add` datetime NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `id_duel_id_pair` (`id_duel`,`id_pair`),
  KEY `id_pair_id_customer` (`id_pair`,`id_customer`),
  KEY `value` (`value`),
  KEY `value_date_add` (`value`,`date_add`),
  KEY `id_customer_value_date_add` (`id_customer`,`value`,`date_add`),
  CONSTRAINT `tag_rating_ibfk_3` FOREIGN KEY (`id_pair`) REFERENCES `tag_product` (`id_pair`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `tag_rating_ibfk_6` FOREIGN KEY (`id_duel`) REFERENCES `tag_rating_duel` (`id_duel`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
The tag_product table has about 250k rows and tag_rating has about 1M rows.

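My problem is that the query takes 0.8 s on average on my machine. Ideally I would like it to run in under 0.5 s, even assuming the tables grow 10 times larger. The number of rows used should stay roughly the same, because I have a date condition (I only want rows less than a month old).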

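Is it possible to speed this up with just a few tricks (that is, without restructuring my tables)? When I modify the statement slightly (dropping the join to the smaller table):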

SELECT r.id_customer, COUNT(*)
FROM tag_rating AS r USE INDEX (value_date_add)
WHERE 
r.value = 1 AND
r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer;
with this EXPLAIN SELECT output:

+----+-------------+-------+------+----------------+----------------+---------+-------+--------+---------------------------------------------------------------------+
| id | select_type | table | type | possible_keys  | key            | key_len | ref   | rows   | Extra                                                               |
+----+-------------+-------+------+----------------+----------------+---------+-------+--------+---------------------------------------------------------------------+
| 1  | SIMPLE      | r     | ref  | value_date_add | value_date_add | 1       | const | 449502 | Using index condition; Using where; Using temporary; Using filesort |
+----+-------------+-------+------+----------------+----------------+---------+-------+--------+---------------------------------------------------------------------+
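it takes about 0.25 s, which is fine. So the join makes the query 3 times slower. Is that unavoidable? Since I am joining on a primary key, I feel the query should not be 3 times slower.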

---UPDATE---

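This is my actual query. The number of distinct id_customer values is about 1000 and is expected to grow, and the rows with value = 1 make up exactly half of the table. So far, query performance seems to degrade linearly with the number of rows in the rating table.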

Adding id_pair to the end of the id_customer_value_date_add or value_id_customer_date_add index does not help.

SELECT r.id_customer, ROUND(AVG(tp.percentile_weighted), 2) AS percentile
FROM tag_rating AS r USE INDEX (id_customer_value_date_add)
JOIN tag_product AS tp ON (tp.id_pair = r.id_pair)
WHERE 
r.value = 1 AND
r.id_customer IN (2593179,1461878,2318871,2654090,2840415,2852531,2987432,3473275,3960453,3961798,4129734,4191571,4202912,4204817,4211263,4248789,765650,1341317,1430380,2116196,3367674,3701901,3995273,4118307,4136114,4236589,783262,913493,1034296,2626574,3574634,3785772,2825128,4157953,3331279,4180367,4208685,4287879,1038898,1445750,1975108,3658055,4185296,4276189,428693,4248631,1892448,3773855,2901524,3830868,3934786) AND
r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer
Here is the EXPLAIN SELECT output:

+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
| id | select_type | table | type   | possible_keys              | key                        | key_len | ref                              | rows   | Extra                    |
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
| 1  | SIMPLE      | r     | range  | id_customer_value_date_add | id_customer_value_date_add | 10      |                                  | 558906 | Using where; Using index |
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+
| 1  | SIMPLE      | tp    | eq_ref | PRIMARY,status             | PRIMARY                    | 4       | dev.r.id_pair                    | 1      | Using where              |
+----+-------------+-------+--------+----------------------------+----------------------------+---------+----------------------------------+--------+--------------------------+

Any tips would be appreciated. Thank you.

Try writing the query using a correlated subquery:

SELECT r.id_customer,
       (SELECT ROUND(AVG(tp.percentile_weighted), 2)
        FROM tag_product tp 
        WHERE tp.id_pair = r.id_pair
       ) AS percentile
FROM tag_rating AS r 
WHERE r.value = 1 AND
      r.date_add > '2020-08-08 11:56:00';
This eliminates the outer aggregation, which should be faster.

INDEX(value, date_add, id_customer, id_pair)
would be "covering", giving an extra performance boost to both queries, and also to Gordon's formulation.
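A minimal sketch of how the suggested covering index could be created. The index name value_date_add_customer_pair is a placeholder invented here for illustration; only the column order comes from the suggestion above.

```sql
-- Hypothetical index name; the column list and order follow the suggestion above.
-- "value" is the equality filter, "date_add" the range filter, and
-- id_customer / id_pair are appended so the index can satisfy the query alone.
ALTER TABLE tag_rating
  ADD INDEX value_date_add_customer_pair (value, date_add, id_customer, id_pair);
```

The queries in the question carry a USE INDEX (value_date_add) hint, so the optimizer can only pick the new index once that hint is removed or changed to name the new index.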

At the same time, get rid of these:

KEY `value` (`value`),
KEY `value_date_add` (`value`,`date_add`),
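because they may get in the way of the optimizer picking the new index. Any other queries that were using those indexes can easily use the new index instead.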
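As a sketch, the two redundant indexes could be dropped in a single statement. This assumes the wider index starting with (value, date_add) has already been created, so that queries filtering on those columns still have an index to use.

```sql
-- Drop the two indexes that are left-prefixes of the wider
-- (value, date_add, id_customer, id_pair) index.
ALTER TABLE tag_rating
  DROP INDEX `value`,
  DROP INDEX value_date_add;
```

Any query that still names value_date_add in a USE INDEX or FORCE INDEX hint will fail once the index is gone, so such hints need to be updated first.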


If you are not using tag_rating.id, remove it and promote the UNIQUE key to PRIMARY KEY.
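One possible way to carry out that change is sketched below. It assumes that nothing else (application code or foreign keys in other tables) refers to tag_rating.id, and it is best tried on a copy of the table first, since it rebuilds the table.

```sql
-- Step 1: drop the surrogate id column (which also removes the old primary key)
-- and make the existing unique pair the new primary key.
ALTER TABLE tag_rating
  DROP COLUMN id,
  ADD PRIMARY KEY (id_duel, id_pair);

-- Step 2: the old UNIQUE index now duplicates the primary key, so drop it.
ALTER TABLE tag_rating
  DROP INDEX id_duel_id_pair;
```

The two steps are kept separate so that the foreign key on id_duel always has an index starting with id_duel available to it.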

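I'm not sure I understand. My query computes, for each id_customer, the average of percentile_weighted over the selected rows from tag_product. Your query has no GROUP BY. Am I missing something?

@honzaik. It has a correlated subquery, so it only computes the average where the pairs match.

But your query returns as many rows as there are rows matching the r.value and r.date_add conditions. My query returns one row per unique id_customer. They are not the same. The number of unique users is only a few hundred, while the number of rows in the rating table is in the hundreds of thousands.

Thank you. Adding the index made the query 20% faster. I also tried removing the id column as you suggested, and when used with my previous index it made an even bigger difference. The only thing I don't understand is why it got faster just by dropping a column (I didn't even need to create a primary index, and it was already faster). I should also mention that in reality I also have a WHERE condition on r.id_customer IN (about 100 ids), and the index was extended to match it. But the query was still slow, so I figured I would make the question simpler.

Making the query simpler makes it different, and we give you advice on the simpler query. Any change to the query, even a small one, can invalidate advice that worked for the "simpler" query. If you want advice on the query with IN, please provide that query.

@honzaik - If I understand your amendment, this may help: INDEX(value, id_customer, date_add, id_pair). It gives the optimizer another index to consider.

Sorry for the delay, I have updated the question. Adding id_pair to the end of the index doesn't seem to do anything, since EXPLAIN says the key used is the same with or without it. So far the query time grows linearly with the number of rows in the tag_rating table (in reality, tag_product and tag_rating grow at the same rate: for each row added to tag_product, 5 rows are added to tag_rating).

@honzaik - EXPLAIN does not show that in the key column. In the Extra column, "Using index" means that the index is "covering". The benefit of "covering" is that the query only needs to look at the index's BTree, without also touching the data's BTree.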
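To check whether an index is covering for a given query, one option is to run EXPLAIN on it and read the Extra column for the tag_rating row. The sketch below reuses the simplified query from the question without the index hint, so that the optimizer is free to choose; what it prints depends on the indexes that exist at the time.

```sql
-- If Extra shows "Using index" for table r, every column the query needs
-- from tag_rating (value, date_add, id_customer) was read from the index
-- alone, without visiting the table rows.
EXPLAIN
SELECT r.id_customer, COUNT(*)
FROM tag_rating AS r
WHERE r.value = 1
  AND r.date_add > '2020-08-08 11:56:00'
GROUP BY r.id_customer;
```

For the join query the same check applies, except that id_pair must also be part of the index, because the join condition reads it from tag_rating.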