How to optimize a MySQL query with 2 inner joins and DISTINCT? (InnoDB)
I have a query against InnoDB tables that I want to optimize. It takes far too long to execute. My database holds 5 million rows, and the query currently takes 250 seconds to run:
INSERT INTO dynamicgroups (adressid)
SELECT SQL_NO_CACHE DISTINCT(addressid) FROM (
    SELECT cluster_0.addressid FROM (
        SELECT DISTINCT addressid FROM (
            SELECT group_all.addressid FROM (
                SELECT g.addressid FROM table2.635_emadresmgroups g
                INNER JOIN table2.emaildata f_0
                    ON f_0.addressid = g.addressid
                WHERE (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH))
                    AND f_0.birthday < CURDATE() )
            ) group_all
        ) AS groups
    ) AS cluster_0
    INNER JOIN(
        SELECT DISTINCT addressid FROM (
            SELECT group_all.addressid FROM (
                SELECT g.addressid FROM table2.635_emadresmgroups g
                INNER JOIN table2.emaildata f_0
                    ON f_0.addressid = g.addressid
                WHERE (marriage_date = ''
                    OR marriage_date = '1900-01-01'
                    OR marriage_date = '0000-00-00' )
            ) group_all
        ) AS groups
    ) AS cluster_1 ON cluster_1.addressid = cluster_0.addressid
    INNER JOIN(
        SELECT DISTINCT addressid FROM (
            SELECT group_all.addressid FROM (
                SELECT g.addressid FROM table2.635_emadresmgroups g
                INNER JOIN table2.emaildata f_0
                    ON f_0.addressid = g.addressid
                WHERE (f_0.city = '34' )
            ) group_all
        ) AS groups
    ) AS cluster_2 ON cluster_2.addressid = cluster_1.addressid
) AS t
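Even though MySQL's EXPLAIN is not as well implemented as its counterparts in other database systems, I suggest running it on your query. You can then analyze the EXPLAIN output and decide which columns should be indexed. The EXPLAIN section of the MySQL manual has more information.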
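As a minimal sketch of that suggestion, EXPLAIN can be prefixed to one of the inner SELECTs from the question (the city filter is used here). The backticks around the table names are my addition, a precaution because one name starts with a digit.

```sql
-- Prefixing the statement with EXPLAIN returns the access plan
-- instead of executing the query.
EXPLAIN
SELECT g.addressid
FROM `table2`.`635_emadresmgroups` g
INNER JOIN `table2`.`emaildata` f_0
        ON f_0.addressid = g.addressid
WHERE f_0.city = '34';
```

In the output, type = ALL on a large table means a full table scan, and NULL in the key column means no index was chosen for that table. Those rows point at the columns worth indexing.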
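Also, the last two SELECTs look very similar. Perhaps you could create a temporary table or a view from them, so that you don't have to run the whole SELECT twice?
Your queries all appear to be variants of this one: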
SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
table2.emaildata f_0
ON f_0.addressid = g.addressid
WHERE (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() )
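The three variants can be collapsed into a single aggregation per addressid, with each condition checked in the HAVING clause. Depending on the volume of data, filtering before the GROUP BY also helps: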
SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
table2.emaildata f_0
ON f_0.addressid = g.addressid
WHERE (f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) OR
(marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) OR
(f_0.city = '34' )
GROUP BY g.addressid
HAVING SUM(f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) > 0 AND
SUM(marriage_date = '' OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) > 0 AND
SUM(f_0.city = '34' ) > 0;
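marriage_date: make the column nullable and store NULL instead of '', '1900-01-01', and so on. That avoids an inefficient OR and may make an index usable.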
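A sketch of what that buys, assuming the placeholder values have already been replaced by NULL and that marriage_date is a nullable DATE column of emaildata (the schema is not shown in the question). The index name idx_marriage_date is hypothetical.

```sql
-- Hypothetical index; it only pays off once the placeholder values are gone.
ALTER TABLE `table2`.`emaildata`
  ADD INDEX idx_marriage_date (marriage_date);

-- The three-way OR collapses into one predicate
-- that an index on marriage_date can serve.
SELECT g.addressid
FROM `table2`.`635_emadresmgroups` g
INNER JOIN `table2`.`emaildata` f_0
        ON f_0.addressid = g.addressid
WHERE f_0.marriage_date IS NULL;
```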
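Please provide SHOW CREATE TABLE for the tables involved, so we can evaluate the current indexes.
Which version are you running? Until recently, this construct was very inefficient: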
FROM ( SELECT ... )
JOIN ( SELECT ... )
The workaround is to put the subqueries into tmp tables and add indexes to them.
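That workaround could look roughly like the following. The temporary table and index names (tmp_birthday, tmp_marriage, tmp_city, idx_addressid) are hypothetical, and the sketch assumes a default database is selected and that marriage_date belongs to emaildata.

```sql
-- One temporary table per filter, each indexed on the join column.
CREATE TEMPORARY TABLE tmp_birthday AS
SELECT DISTINCT g.addressid
FROM `table2`.`635_emadresmgroups` g
INNER JOIN `table2`.`emaildata` f_0 ON f_0.addressid = g.addressid
WHERE f_0.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
  AND f_0.birthday < CURDATE();
ALTER TABLE tmp_birthday ADD INDEX idx_addressid (addressid);

CREATE TEMPORARY TABLE tmp_marriage AS
SELECT DISTINCT g.addressid
FROM `table2`.`635_emadresmgroups` g
INNER JOIN `table2`.`emaildata` f_0 ON f_0.addressid = g.addressid
WHERE f_0.marriage_date = ''
   OR f_0.marriage_date = '1900-01-01'
   OR f_0.marriage_date = '0000-00-00';
ALTER TABLE tmp_marriage ADD INDEX idx_addressid (addressid);

CREATE TEMPORARY TABLE tmp_city AS
SELECT DISTINCT g.addressid
FROM `table2`.`635_emadresmgroups` g
INNER JOIN `table2`.`emaildata` f_0 ON f_0.addressid = g.addressid
WHERE f_0.city = '34';
ALTER TABLE tmp_city ADD INDEX idx_addressid (addressid);

-- The final join now runs over small, indexed tables.
-- Each temporary table is referenced only once in this statement.
SELECT c0.addressid
FROM tmp_birthday c0
INNER JOIN tmp_marriage c1 ON c1.addressid = c0.addressid
INNER JOIN tmp_city c2 ON c2.addressid = c1.addressid;
```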
This may help in your case, since you seem to be using the JOINs only for filtering: turn each JOIN into SELECT ... WHERE EXISTS ( SELECT * ... )
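One possible shape of that rewrite, as a sketch rather than a tuned query. It assumes marriage_date is a column of emaildata and keeps DISTINCT in case 635_emadresmgroups holds the same addressid more than once.

```sql
-- Each former JOIN becomes an EXISTS test against emaildata.
SELECT DISTINCT g.addressid
FROM `table2`.`635_emadresmgroups` g
WHERE EXISTS (
        SELECT * FROM `table2`.`emaildata` f
        WHERE f.addressid = g.addressid
          AND f.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
          AND f.birthday < CURDATE()
      )
  AND EXISTS (
        SELECT * FROM `table2`.`emaildata` f
        WHERE f.addressid = g.addressid
          AND (f.marriage_date = ''
            OR f.marriage_date = '1900-01-01'
            OR f.marriage_date = '0000-00-00')
      )
  AND EXISTS (
        SELECT * FROM `table2`.`emaildata` f
        WHERE f.addressid = g.addressid
          AND f.city = '34'
      );
```

An index on emaildata that starts with addressid lets each EXISTS probe stop at the first matching row.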
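Please describe in plain English what the query is trying to do.
Another approach, building on Gordon's suggestion about the common SELECT: put the common SELECT into a temporary table, add an index, and then query from it.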
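A sketch of that approach, combined with the GROUP BY / HAVING form shown earlier. The names tmp_common and idx_addressid are hypothetical. The sketch assumes that birthday, marriage_date and city all live in emaildata and that a default database is selected.

```sql
-- Run the shared join once and keep only the columns the filters need.
CREATE TEMPORARY TABLE tmp_common AS
SELECT g.addressid, f_0.birthday, f_0.marriage_date, f_0.city
FROM `table2`.`635_emadresmgroups` g
INNER JOIN `table2`.`emaildata` f_0
        ON f_0.addressid = g.addressid;

ALTER TABLE tmp_common ADD INDEX idx_addressid (addressid);

-- Keep an addressid only if at least one of its rows satisfies each condition.
SELECT addressid
FROM tmp_common
GROUP BY addressid
HAVING SUM(birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
           AND birthday < CURDATE()) > 0
   AND SUM(marriage_date = ''
           OR marriage_date = '1900-01-01'
           OR marriage_date = '0000-00-00') > 0
   AND SUM(city = '34') > 0;
```

Whether copying the joined rows is cheaper than repeating the join depends on how many rows the join produces, so it is worth timing both variants.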