MySQL query is slow only when using ORDER BY field DESC and LIMIT

Overview


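I'm running MySQL 5.7.30-33 and I've run into a problem where MySQL appears to use the wrong index when running a query. With the existing query I get a query time of 3 seconds. However, just by changing the ORDER BY direction, removing the LIMIT, or forcing an index, I get a query time of 0.01 seconds. Unfortunately I need to stick with my original query (it is baked into an application), so it would be great if the discrepancy could be resolved in the schema/indexes.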

Setup/Problem

My table structure is as follows:

CREATE TABLE `referrals` (
  `__id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `systemcreated` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `referrerid` mediumtext COLLATE utf8mb4_unicode_ci,
  `referrersiteid` varchar(50) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  ... lots more mediumtext fields ...
  PRIMARY KEY (`__id`),
  KEY `systemcreated` (`systemcreated`,`referrersiteid`,`__id`)
) ENGINE=InnoDB AUTO_INCREMENT=53368 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci ROW_FORMAT=COMPRESSED
The table has only ~55k rows, but it is very wide because some of the fields contain huge BLOBs:

mysql> show table status like 'referrals'\G;
*************************** 1. row ***************************
           Name: referrals
         Engine: InnoDB
        Version: 10
     Row_format: Compressed
           Rows: 45641
 Avg_row_length: 767640
    Data_length: 35035897856
Max_data_length: 0
   Index_length: 3653632
      Data_free: 3670016
 Auto_increment: 54008
    Create_time: 2020-12-12 12:46:14
    Update_time: 2020-12-12 17:50:28
     Check_time: NULL
      Collation: utf8mb4_unicode_ci
       Checksum: NULL
 Create_options: row_format=COMPRESSED
        Comment: 
1 row in set (0.00 sec)
My client's application queries the table with the following query, which unfortunately cannot easily be changed:

SELECT  *
    FROM  referrals
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    order by  __id desc
    limit  16;
This results in a query time of about 3 seconds.

The EXPLAIN looks like this:

+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table       | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | referrals   | NULL       | index | systemcreated | PRIMARY | 4       | NULL |   32 |     5.56 | Using where |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
Note that it uses the PRIMARY key for the query, not the systemcreated index.

Experiment 1

If I change the query to use ASC instead of DESC:

SELECT  *
    FROM  referrals
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    order by  __id asc
    limit  16;
then it takes 0.01 seconds, and the EXPLAIN looks the same:

+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table       | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | referrals   | NULL       | index | systemcreated | PRIMARY | 4       | NULL |   32 |     5.56 | Using where |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
Experiment 2

If I change the query to keep ORDER BY __id DESC but remove the LIMIT:

SELECT  *
    FROM  referrals
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    order by  __id desc;
then it also takes 0.01 seconds, and the EXPLAIN looks like this:

+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+---------------------------------------+
| id | select_type | table       | partitions | type  | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | referrals   | NULL       | range | systemcreated | systemcreated | 406     | NULL | 2086 |    11.11 | Using index condition; Using filesort |
+----+-------------+-------------+------------+-------+---------------+---------------+---------+------+------+----------+---------------------------------------+
Experiment 3

Alternatively, if I force the original query to use the systemcreated index, it also gives a 0.01-second query time. Here is the EXPLAIN:

mysql> explain     SELECT  *
    FROM  referrals USE INDEX (systemcreated)
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    order by  __id desc
    limit  16;

+----+-------------+--------------+------------+-------+---------------+---------------+---------+------+------+----------+---------------------------------------+
| id | select_type | table        | partitions | type  | possible_keys | key           | key_len | ref  | rows | filtered | Extra                                 |
+----+-------------+--------------+------------+-------+---------------+---------------+---------+------+------+----------+---------------------------------------+
|  1 | SIMPLE      | referrals    | NULL       | range | systemcreated | systemcreated | 406     | NULL | 2086 |    11.11 | Using index condition; Using filesort |
+----+-------------+--------------+------------+-------+---------------+---------------+---------+------+------+----------+---------------------------------------+
Experiment 4

Finally, if I keep the original ORDER BY __id DESC LIMIT 16 but select fewer fields, it also returns in 0.01 seconds! Here is the EXPLAIN:

mysql> explain     SELECT  field1, field2, field3, field4, field5
    FROM  referrals
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    order by  __id desc
    limit  16;

+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table       | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | referrals   | NULL       | index | systemcreated | PRIMARY | 4       | NULL |   32 |     5.56 | Using where |
+----+-------------+-------------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
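Summary

So the only combination that performs poorly is ORDER BY __id DESC LIMIT 16.

I think my index is set up correctly. I'm querying via the systemcreated and referrersiteid fields and ordering by __id, so I have an index defined as (systemcreated, referrersiteid, __id), but MySQL still seems to use the PRIMARY key.

Any suggestions?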

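  • "Avg_row_length: 767640"; lots of MEDIUMTEXT. A row is limited to about 8KB; the overflow goes into "off-record" blocks. Reading those blocks takes extra disk hits.

  • SELECT * will reach for all those fat columns. The total will be about 50 reads (of 16KB each) per row. This takes time.

  • (Exp 4) SELECT a, b, c, d ran faster because it did not need to fetch all ~50 blocks per row.

  • Your secondary index, (systemcreated, referrersiteid, __id): only the first column is useful. This is because of systemcreated LIKE 'xxx%'. That is a "range". Once a range is hit, the rest of the index is ineffective. Except...

  • "Index hints" (USE INDEX(...)) may help today, but may make things worse tomorrow when the data distribution changes.

  • If you cannot get rid of the wildcards in LIKE, I recommend these two indexes: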

      INDEX(systemcreated)
      INDEX(referrersiteid)
    
  • The real speedup can come from turning the query inside out. That is, first find the 16 ids, then go looking up all those bulky columns:

      SELECT  r2...   -- whatever you want
          FROM  
          (
              SELECT  __id
                  FROM  referrals
                  WHERE  `systemcreated` LIKE 'XXXXXX%'
                    AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
                  order by  __id desc
                  limit  16 
          ) AS r1
          JOIN  referrals r2 USING(__id)
          ORDER BY  __id DESC   -- yes, this needs repeating 
    
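And keep your existing 3-column secondary index. Even if it has to scan a lot more than 16 rows to find the 16 desired, it is a lot less bulky. This means the subquery (a "derived table") will be moderately fast. The outer query will then still have 16 lookups, possibly 16*50 blocks to read. The total number of blocks read will still be far smaller.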
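As a rough check on the "~50 blocks per row" figure: 767640 bytes / 16384 bytes per page is about 46.9, so roughly 47 pages per row. The sketch below reads the same statistics from information_schema. It assumes the default 16KB InnoDB page size and that the current database is the one holding referrals. Because the table uses ROW_FORMAT=COMPRESSED and these numbers are estimates, treat the result as an order of magnitude only.

```sql
-- Rough estimate of how many InnoDB pages one average row occupies.
-- Assumes the referrals table lives in the currently selected database.
SELECT table_name,
       table_rows,                                   -- estimated row count
       avg_row_length,                               -- estimated bytes per row
       CEIL(avg_row_length / @@innodb_page_size) AS approx_pages_per_row
FROM   information_schema.tables
WHERE  table_schema = DATABASE()
  AND  table_name   = 'referrals';
```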

There is rarely a noticeable difference between ASC and DESC on ORDER BY.

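Why did the Optimizer pick the PK instead of the seemingly better secondary index? The PK might be best, especially if the 16 rows are at the "end" (DESC) of the table. But it would be a terrible choice if it had to scan the entire table without finding 16 rows.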

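Meanwhile, the wildcard test makes the secondary index only partially useful. The Optimizer makes its decisions based on inadequate statistics. Sometimes it feels like the flip of a coin.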
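One way to see what the Optimizer actually weighed is the optimizer trace, available in MySQL 5.6 and later. The following is a minimal sketch for a single session; the memory limit shown is an arbitrary value chosen so that the trace is less likely to be truncated.

```sql
-- Turn on the optimizer trace for this session only.
SET optimizer_trace = 'enabled=on';
SET optimizer_trace_max_mem_size = 1048576;  -- arbitrary 1MB cap, larger than the 5.7 default

-- EXPLAIN is traced too, so the slow query does not have to be executed.
EXPLAIN
SELECT  *
    FROM  referrals
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    ORDER BY  __id DESC
    LIMIT  16;

-- The trace is a JSON document describing the access paths considered and their costs.
SELECT  trace
    FROM  information_schema.optimizer_trace;

SET optimizer_trace = 'enabled=off';
```

Comparing the trace with and without the LIMIT should show where the plan switches from the range scan on systemcreated to the scan of the PRIMARY key.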

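If you use my inside-out reformulation, then I recommend the following two composite indexes; the Optimizer can make a semi-intelligent, semi-correct choice between them for the derived table: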

INDEX(systemcreated, referrersiteid, __id),
INDEX(referrersiteid, systemcreated, __id)
It will continue to say "filesort", but don't worry; it is only sorting 16 rows.

And, remember, SELECT * is hurting performance. (Though you may not be able to fix that.)
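Since the first of those two indexes already exists on the table (as the key named systemcreated), only the second has to be added. A sketch, in which the index name referrersiteid_systemcreated is made up for illustration and the ALGORITHM/LOCK clauses are an assumption about how you would want to build it on a live table:

```sql
-- Add the second composite index; the first already exists as KEY `systemcreated`.
-- The index name is hypothetical; pick whatever fits your naming scheme.
ALTER TABLE referrals
    ADD INDEX referrersiteid_systemcreated (referrersiteid, systemcreated, __id),
    ALGORITHM = INPLACE, LOCK = NONE;

-- Check which index the derived table would use. Every referenced column is
-- in either index, so the subquery should not need to touch the wide rows.
EXPLAIN
SELECT  __id
    FROM  referrals
    WHERE  `systemcreated` LIKE 'XXXXXX%'
      AND  `referrersiteid` LIKE 'XXXXXXXXXXXX%'
    ORDER BY  __id DESC
    LIMIT  16;
```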
