MySQL: splitting up the left table in a LEFT JOIN query to improve performance

mysql performance join

I have the following MySQL query:

SELECT pool.username
FROM pool
LEFT JOIN sent ON pool.username = sent.username
AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
AND pool.gender = 'f'
AND (`location` = 'united states' OR `location` = 'us' OR `location` = 'usa');

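The problem is that the pool table contains millions of rows, and this query takes 12 minutes to complete. I realize that the entire left table, pool, is being scanned in this query. The pool table has an auto-increment id column.

I would like to split this query into multiple queries so that, instead of scanning the entire pool table, I scan 1000 rows at a time, and in each subsequent query I use the id column to keep track: 1000-2000, 2000-3000, and so on.

How can I specify this in the query? If you know the answer, please give an example. Thanks, everyone.

Here are my indexes (if that helps):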

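mysql> show index from main.pool;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |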
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| pool  |          0 | PRIMARY  |            1 | id          | A         |     9275039 |     NULL | NULL   |      | BTREE      |         |
| pool  |          1 | username |            1 | username    | A         |     9275039 |     NULL | NULL   |      | BTREE      |         |
| pool  |          1 | source   |            1 | source      | A         |           1 |     NULL | NULL   |      | BTREE      |         |
| pool  |          1 | location |            1 | location    | A         |       38168 |     NULL | NULL   |      | BTREE      |         |
| pool  |          1 | pdex     |            1 | gender      | A         |           2 |     NULL | NULL   |      | BTREE      |         |
| pool  |          1 | pdex     |            2 | username    | A         |     9275039 |     NULL | NULL   |      | BTREE      |         |
| pool  |          1 | pdex     |            3 | id          | A         |     9275039 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
8 rows in set (0.00 sec)

mysql> show index from main.sent;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| sent  |          0 | PRIMARY  |            1 | primary_key | A         |         351 |     NULL | NULL   |      | BTREE      |         |
| sent  |          1 | username |            1 | username    | A         |         175 |     NULL | NULL   |      | BTREE      |         |
| sent  |          1 | sdex     |            1 | campid      | A         |           7 |     NULL | NULL   |      | BTREE      |         |
| sent  |          1 | sdex     |            2 | username    | A         |         351 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+

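Here is the EXPLAIN output for my query:

+----+-------------+-------+-------+---------------+------+---------+-------+---------+--------------------------------------+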
| id | select_type | table | type  | possible_keys | key  | key_len | ref   | rows    | Extra                                |
+----+-------------+-------+-------+---------------+------+---------+-------+---------+--------------------------------------+
|  1 | SIMPLE      | pool  | ref   | location,pdex | pdex | 5       | const | 6084332 | Using where                          |
|  1 | SIMPLE      | sent  | index | sdex          | sdex | 309     | NULL  |     351 | Using where; Using index; Not exists |
+----+-------------+-------+-------+---------------+------+---------+-------+---------+--------------------------------------+

Here is the structure of the pool table:

| pool  | CREATE TABLE `pool` (
`id` int(20) NOT NULL AUTO_INCREMENT,
`username` varchar(50) CHARACTER SET utf8 NOT NULL,
`source` varchar(10) CHARACTER SET utf8 NOT NULL,
`gender` varchar(1) CHARACTER SET utf8 NOT NULL,
`location` varchar(50) CHARACTER SET utf8 NOT NULL,
PRIMARY KEY (`id`),
KEY `username` (`username`),
KEY `source` (`source`),
KEY `location` (`location`),
KEY `pdex` (`gender`,`username`,`id`)
) ENGINE=MyISAM AUTO_INCREMENT=9327026 DEFAULT CHARSET=latin1 |

Here is the structure of the sent table:

| sent  | CREATE TABLE `sent` (
`primary_key` int(50) NOT NULL AUTO_INCREMENT,
`username` varchar(50) NOT NULL,
`from` varchar(50) NOT NULL,
`campid` varchar(255) NOT NULL,
`timestamp` int(20) NOT NULL,
PRIMARY KEY (`primary_key`),
KEY `username` (`username`),
KEY `sdex` (`campid`,`username`)
) ENGINE=MyISAM AUTO_INCREMENT=352 DEFAULT CHARSET=latin1 |

This produces a syntax error, but the WHERE clause at the start is what I'm after:

SELECT pool.username
FROM pool
WHERE id < 1000
LEFT JOIN sent ON pool.username = sent.username
AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
AND pool.gender = 'f'
AND (location = 'united states' OR location = 'us' OR location = 'usa');

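It looks like it is using pool.location. You could try adding an index on gender, but it probably won't help much. Rationalising location into a country code in the data, and indexing that, might be useful.

But it seems to me that the first index to add would be on campid, which could seriously reduce the number of records it has to test.

Splitting the query doesn't sound like the right approach.

A better approach would be to fetch some records from the existing query, send the messages, and then carry on fetching.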
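One way to read the "fetch some, send, continue" suggestion is to put a LIMIT on the existing query. This is only a sketch: the batch size of 1000 is taken from the question, and it assumes the application inserts a row into sent (with this campid) for every message it sends, so that already-handled usernames drop out of the next run.

```sql
-- Fetch the next batch of usernames that have not yet been sent this campaign.
-- Assumption: after sending, the application inserts (username, campid) into sent,
-- so re-running the same statement returns a fresh batch.
SELECT pool.username
FROM pool
LEFT JOIN sent ON pool.username = sent.username
  AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
  AND pool.gender = 'f'
  AND pool.location IN ('united states', 'us', 'usa')
LIMIT 1000;
```

Without an ORDER BY, MySQL can stop as soon as it has found 1000 matching rows, so early batches should not need to read the whole pool table.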

Your query might benefit from another composite index on

pool( location, gender, username )

This would allow the whole query to be run from sdex and the new index.
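The composite index suggested above could be created roughly as follows. The index name `ldex` is made up for illustration; the column order is the one given in the answer.

```sql
-- Hypothetical index name `ldex`; columns in the order suggested above.
ALTER TABLE pool ADD INDEX ldex (location, gender, username);

-- Re-check the plan afterwards to see whether the new index is chosen.
EXPLAIN
SELECT pool.username
FROM pool
LEFT JOIN sent ON pool.username = sent.username
  AND sent.campid = 'YA1LGfh9'
WHERE sent.username IS NULL
  AND pool.gender = 'f'
  AND pool.location IN ('united states', 'us', 'usa');
```

If the optimizer picks the new index, the Extra column for pool would be expected to show "Using index", since location, gender and username are all in the index. Building an index on a MyISAM table of about 9 million rows may take a while and locks the table for writes while it runs.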

If you really want to split the query, a simple way is

SELECT MIN(id), MAX(id) FROM pool

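then loop from the minimum to the maximum in steps of 1000 and add id >= r and id … to the query.

If there are gaps in the ids this may return 0 rows, but it will never return more than 1000 rows at a time. A different composite index on pool (including id, location, gender and username) might help this query.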
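A single batch of the loop described above might look like the sketch below. The upper bound `id < r + 1000` is an assumption (the condition in the answer is cut off), and the literal values 1 and 1001 stand in for whatever the calling code substitutes on each iteration.

```sql
-- Run once, up front, to find the range to iterate over.
SELECT MIN(id), MAX(id) FROM pool;

-- One batch with r = 1. The calling code would repeat this with
-- r = 1001, 2001, ... until r exceeds MAX(id).
SELECT pool.username
FROM pool
LEFT JOIN sent ON pool.username = sent.username
  AND sent.campid = 'YA1LGfh9'
WHERE pool.id >= 1
  AND pool.id < 1001        -- assumed upper bound: r + 1000
  AND sent.username IS NULL
  AND pool.gender = 'f'
  AND pool.location IN ('united states', 'us', 'usa');
```

The id range goes into the one WHERE clause, after the JOIN; a query cannot have a second WHERE before LEFT JOIN. The position of a condition inside WHERE does not dictate the order in which MySQL evaluates it, so the optimizer is free to use a range scan on the PRIMARY key for the id bounds. Running EXPLAIN on the batch query should show whether it does.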

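How many rows does this query return? Is location a column of pool?

The query currently returns over 6 million rows. location is a column of pool.

Is there an index on it? And what do you do with those 6 million rows?

I've added my indexes and the EXPLAIN output to my main post. The data in the rows is used to send messages to users.

This may be a silly question, but do you need all of the data in the pool table?

gender is part of the pdex index, and campid is part of the sdex index.

I will do that, but the query time was the same even before I used the location column. I really want to split up the pool table within the query so that I can use multiple queries.

Is it possible to specify a start id and an end id in the query? I don't think simply adding it to the end of the query achieves the goal. Where/how can I add it so that a single query only scans ids x-y of the pool table? I want to do something like this, but it produces an error: SELECT pool.username FROM pool WHERE id < 1000 LEFT JOIN sent ON pool.username = sent.username AND sent.campid = 'YA1LGfh9' WHERE sent.username IS NULL AND pool.gender = 'f' AND (location = 'united states' OR location = 'us' OR location = 'usa')

You need to add this condition to the existing WHERE clause: WHERE sent.username IS NULL AND id…

That doesn't solve the problem, because the whole pool table will be searched before that clause.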