How to optimize a MySQL query with 2 inner joins and DISTINCT? (InnoDB)

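I have a query that runs against tables using the InnoDB storage engine.

I want to optimize it. It takes too long to execute. My database holds 5 million rows, and the query currently takes 250 seconds to run.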

INSERT INTO dynamicgroups (adressid) 

    SELECT SQL_NO_CACHE DISTINCT(addressid) FROM (
        SELECT cluster_0.addressid FROM (
            SELECT DISTINCT addressid FROM (
                SELECT group_all.addressid FROM (
                    SELECT g.addressid FROM table2.635_emadresmgroups g 
                        INNER JOIN table2.emaildata f_0
                               ON f_0.addressid = g.addressid
                        WHERE  (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH))
                            AND f_0.birthday < CURDATE() )
                ) group_all
            ) AS groups

        ) AS cluster_0

        INNER JOIN(
            SELECT DISTINCT addressid FROM (
                SELECT group_all.addressid FROM (
                    SELECT g.addressid FROM table2.635_emadresmgroups g 
                        INNER JOIN table2.emaildata f_0
                               ON f_0.addressid = g.addressid
                        WHERE  (marriage_date = ''
                             OR marriage_date = '1900-01-01'
                             OR marriage_date = '0000-00-00' )
                ) group_all
            ) AS groups
        ) AS cluster_1 ON cluster_1.addressid = cluster_0.addressid

        INNER JOIN(
            SELECT DISTINCT addressid FROM (
                SELECT group_all.addressid FROM (
                    SELECT g.addressid FROM table2.635_emadresmgroups g 
                        INNER JOIN table2.emaildata f_0
                                ON f_0.addressid = g.addressid
                        WHERE  (f_0.city = '34' )
                ) group_all
            ) AS groups
        ) AS cluster_2 ON cluster_2.addressid = cluster_1.addressid 
    ) AS t

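Even though the EXPLAIN operator is not implemented as well as others, I recommend running it on your query.

Afterwards, you can analyze the EXPLAIN output and decide which columns should be indexed.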
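
As a rough sketch of that workflow, EXPLAIN can be run on one of the inner selects from the question. The index name `idx_city_addressid` and the choice of columns are assumptions for illustration only; the real choice depends on what EXPLAIN reports and on the indexes that already exist.

```sql
-- Show the execution plan for one of the three filtering selects.
-- Look at the "type", "key" and "rows" columns of the output:
-- type = ALL with a large row estimate usually means a full table scan.
EXPLAIN
SELECT g.addressid
FROM table2.`635_emadresmgroups` g
INNER JOIN table2.emaildata f_0
        ON f_0.addressid = g.addressid
WHERE f_0.city = '34';

-- Hypothetical follow-up if the plan shows a full scan of emaildata:
-- an index covering the filter column and the join column.
ALTER TABLE table2.emaildata
    ADD INDEX idx_city_addressid (city, addressid);
```

Re-running the same EXPLAIN after adding an index shows whether the optimizer actually picks it up.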

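For more information, I suggest looking at the following sources:


Also, the last two selects look very similar. Maybe you could create a temporary table or a view from them, so that you do not have to run the whole select twice?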

Your subqueries all seem to be variations on this query:

SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
     table2.emaildata f_0
     ON f_0.addressid = g.addressid
WHERE  (f_0.birthday > date(DATE_SUB(NOW(),INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() )
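
Because the three derived tables are inner-joined on addressid, the result is every addressid that satisfies all three conditions, which a single GROUP BY with a HAVING clause can express. Depending on the data volume, filtering before the GROUP BY can also help: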

SELECT g.addressid
FROM table2.635_emadresmgroups g INNER JOIN
     table2.emaildata f_0
     ON f_0.addressid = g.addressid
WHERE (f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) OR
      (marriage_date = ''  OR marriage_date = '1900-01-01' OR marriage_date = '0000-00-00' ) OR
      (f_0.city = '34' )
GROUP BY g.addressid
HAVING SUM(f_0.birthday > date(DATE_SUB(NOW(), INTERVAL 18 MONTH)) AND f_0.birthday < CURDATE() ) > 0 AND
       SUM(marriage_date = '' OR marriage_date = '1900-01-01'  OR marriage_date = '0000-00-00' ) > 0 AND
       SUM(f_0.city = '34' ) > 0;
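
marriage_date: make it NULLable and use NULL instead of '', '1900-01-01' and '0000-00-00'. That avoids an inefficient OR and may make an index usable.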
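
A minimal sketch of that change follows. It assumes that marriage_date lives in table2.emaildata and is a DATE column; the question shows neither, so the column type and the index name `idx_marriage_date` are placeholders to adapt to the real schema.

```sql
-- Assumption: marriage_date is a DATE column on table2.emaildata.
-- Keep the column's real type here; only the NULL attribute matters.
ALTER TABLE table2.emaildata
    MODIFY marriage_date DATE NULL;

-- Replace the three "no date" sentinel values with NULL.
UPDATE table2.emaildata
SET marriage_date = NULL
WHERE marriage_date = ''
   OR marriage_date = '1900-01-01'
   OR marriage_date = '0000-00-00';

-- Hypothetical index so the IS NULL test can be resolved from an index.
ALTER TABLE table2.emaildata
    ADD INDEX idx_marriage_date (marriage_date);

-- The three-way OR from the question then collapses to one predicate.
SELECT g.addressid
FROM table2.`635_emadresmgroups` g
INNER JOIN table2.emaildata f_0
        ON f_0.addressid = g.addressid
WHERE f_0.marriage_date IS NULL;
```

On a table with 5 million rows both ALTER statements and the UPDATE can take a while, so they are best tried on a copy first.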

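Please provide SHOW CREATE TABLE so that we can evaluate the current indexes.

Which version are you running? Until recently, this construct was very inefficient: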

FROM ( SELECT ... )
JOIN ( SELECT ... )
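
The workaround is to put the subqueries into temporary tables and add an index to each.

This may help in your case, since you seem to be using the JOINs only for filtering: turn each JOIN into SELECT ... WHERE EXISTS ( SELECT * ... )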
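
One way the EXISTS form could look for the query in the question is sketched below. It assumes that marriage_date belongs to table2.emaildata (the question leaves it unqualified) and that dynamicgroups.adressid is spelled as in the original INSERT.

```sql
-- Each derived table + INNER JOIN becomes one EXISTS test against emaildata.
INSERT INTO dynamicgroups (adressid)
SELECT DISTINCT g.addressid
FROM table2.`635_emadresmgroups` g
WHERE EXISTS (              -- birthday within the last 18 months
        SELECT *
        FROM table2.emaildata f
        WHERE f.addressid = g.addressid
          AND f.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
          AND f.birthday < CURDATE())
  AND EXISTS (              -- no real marriage date recorded
        SELECT *
        FROM table2.emaildata f
        WHERE f.addressid = g.addressid
          AND (f.marriage_date = ''
            OR f.marriage_date = '1900-01-01'
            OR f.marriage_date = '0000-00-00'))
  AND EXISTS (              -- city code 34
        SELECT *
        FROM table2.emaildata f
        WHERE f.addressid = g.addressid
          AND f.city = '34');
```

Each EXISTS can stop at the first matching row, so an index on emaildata that starts with addressid is what this form benefits from most.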

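Please describe in plain English what the query is trying to do.

Another approach, based on Gordon's suggestion about the common SELECT: put the common SELECT into a temporary table, add an index, and then query from that table.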
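
A sketch of that approach, under the same assumption that marriage_date is a column of table2.emaildata. The table name `tmp_common` is made up for this example. MySQL does not allow a TEMPORARY table to be referenced more than once in the same query, so the final step uses a single aggregated pass rather than a three-way self-join.

```sql
-- Step 1: materialize the common join once, pre-filtered to rows
-- that satisfy at least one of the three conditions.
CREATE TEMPORARY TABLE tmp_common AS
SELECT g.addressid, f_0.birthday, f_0.marriage_date, f_0.city
FROM table2.`635_emadresmgroups` g
INNER JOIN table2.emaildata f_0
        ON f_0.addressid = g.addressid
WHERE (f_0.birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
       AND f_0.birthday < CURDATE())
   OR (f_0.marriage_date = ''
       OR f_0.marriage_date = '1900-01-01'
       OR f_0.marriage_date = '0000-00-00')
   OR f_0.city = '34';

-- Step 2: index the grouping column.
ALTER TABLE tmp_common ADD INDEX (addressid);

-- Step 3: keep the addressids that meet all three conditions.
INSERT INTO dynamicgroups (adressid)
SELECT addressid
FROM tmp_common
GROUP BY addressid
HAVING SUM(birthday > DATE(DATE_SUB(NOW(), INTERVAL 18 MONTH))
           AND birthday < CURDATE()) > 0
   AND SUM(marriage_date = ''
           OR marriage_date = '1900-01-01'
           OR marriage_date = '0000-00-00') > 0
   AND SUM(city = '34') > 0;

DROP TEMPORARY TABLE tmp_common;
```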
