MySQL: joining a table to itself is very slow

mysql performance join


I have a table that can be joined to itself. I want to join it twice. Here is the schema:

CREATE TABLE `route_connections` (
    `id` int(11) NOT NULL AUTO_INCREMENT,
    `from_route_iid` int(11) NOT NULL,
    `from_service_id` varchar(100) NOT NULL,
    `to_route_iid` int(11) NOT NULL,
    `to_service_id` varchar(100) NOT NULL,
    PRIMARY KEY (`id`),
    KEY `to_route` (`to_route_iid`),
    KEY `from_route` (`from_route_iid`),
    KEY `to_service` (`to_service_id`),
    KEY `from_service` (`from_service_id`),
    KEY `from_to_route` (`from_route_iid`,`to_route_iid`)
) ENGINE=InnoDB AUTO_INCREMENT=6798783 DEFAULT CHARSET=utf8

It has about 3.7 million rows.

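My main goal is to find a path that uses 3 routes (2 route connections), given the lists of allowed departure and arrival routes (the connecting route must be determined by the query).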

Path: route A → route B → route C:

Departure route (known list, A)
`route_connections` c1 (A → B)
Connecting route (unknown, B)
`route_connections` c2 (B → C)
Arrival route (known list, C)

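So I need to select three `route_iid` values: `c1.from`, `c1.to` or `c2.from` (they are the same), and `c2.to`.

In addition, I need to filter each `service_id` with the following filter: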

service_id in (
    select service_id from (
        select service_id from calendar c
            where c.start_date <= 20141109 and end_date >= 20141109 

        union

        select service_id from calendar_dates cd 
            where cd.date = 20141109 and exception_type = 1 
    ) x 
    where x.service_id not in (
        select service_id from calendar_dates cd 
        where cd.date = 20141109 and exception_type = 2
    )
)

But my goal is to find the 2 connections, so I use this query, which takes a lot of time (and gives no results either):

It used to take 50 seconds, but I added the `from_to_route` index, which brought the query down to 18-20 seconds.

I also tried it without the JOIN keyword:

SELECT ...
FROM route_connections c1, route_connections c2
WHERE ...

But it gives exactly the same performance (I guess that internally it is exactly the same as a join).
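That guess matches how MySQL treats the two syntaxes: a comma join with its condition in WHERE and an explicit INNER JOIN ... ON express the same inner join, so the optimizer should arrive at the same plan for both. A minimal sketch of the two forms follows. It assumes the join condition implied by the EXPLAIN output further down (`c2.from_route_iid = c1.to_route_iid`) and borrows the route lists from the solution at the end as placeholder filters. The selected columns and the `via_route_iid` alias are illustrative assumptions, not the original query.

```sql
-- Implicit (comma) join: the join condition sits in WHERE.
SELECT c1.from_route_iid, c1.to_route_iid AS via_route_iid, c2.to_route_iid
FROM route_connections c1, route_connections c2
WHERE c2.from_route_iid = c1.to_route_iid
  AND c1.from_route_iid IN (864, 865, 495, 494, 459, 54, 458)
  AND c2.to_route_iid IN (745, 744, 1096, 1093, 743, 317, 742, 13, 316);

-- Explicit join: the same condition moves to ON, the filters stay in WHERE.
SELECT c1.from_route_iid, c1.to_route_iid AS via_route_iid, c2.to_route_iid
FROM route_connections c1
INNER JOIN route_connections c2
    ON c2.from_route_iid = c1.to_route_iid
WHERE c1.from_route_iid IN (864, 865, 495, 494, 459, 54, 458)
  AND c2.to_route_iid IN (745, 744, 1096, 1093, 743, 317, 742, 13, 316);
```

Prefixing each statement with EXPLAIN is a quick way to check that both produce the same plan on a given server.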

I tried changing the INNER JOIN to a LEFT JOIN plus a HAVING clause, but that was worse (as expected).

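I tried dropping all indexes except the following two:

PRIMARY KEY (`id`)
KEY `from_route_iid` (`from_route_iid`, `to_route_iid`)

The result is the same, about 18-20 seconds.

Here is the EXPLAIN output: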
+----+-------------+-------+-------+------------------------------------+----------------+---------+----------------------------------+-------+----------------------------------+
| id | select_type | table | type  | possible_keys                      | key            | key_len | ref                              | rows  | Extra                            |
+----+-------------+-------+-------+------------------------------------+----------------+---------+----------------------------------+-------+----------------------------------+
|  1 | SIMPLE      | c1    | range | to_route,from_route,from_route_iid | from_route     | 4       | NULL                             | 15464 | Using index condition; Using MRR |
|  1 | SIMPLE      | c2    | ref   | to_route,from_route,from_route_iid | from_route_iid | 4       | bicou_gtfs_paris.c1.to_route_iid |  1746 | Using index condition            |
+----+-------------+-------+-------+------------------------------------+----------------+---------+----------------------------------+-------+----------------------------------+

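What is the correct way to join a table to itself? Am I missing an index?

Hardware is a 2014 MacBook Air with a 1.7 GHz Core i7, 8 GB RAM and a 256 GB SSD.

Software is Mac OS X 10.10 Yosemite with MySQL 5.6.21.

OK, here is how I found the solution: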
select to_route_iid
from route_connections
where from_route_iid in (864, 865, 495, 494, 459, 54, 458)

=> 15471 rows
select to_route_iid
from route_connections
where from_route_iid in (864, 865, 495, 494, 459, 54, 458)
group by to_route_iid

=> 97 rows

The same goes for the arrival routes: 131 rows with grouping, 25427 rows without.
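The `rows` estimates in the EXPLAIN output above suggest why collapsing duplicates first pays off: the plan reads roughly 15464 rows from c1 and, for each of them, about 1746 rows from c2. A back-of-the-envelope sketch, treating those optimizer estimates as if they were exact (which is an assumption):

```sql
-- Estimated c2 row visits when every matching c1 row drives a lookup.
SELECT 15464 * 1746 AS estimated_visits_ungrouped;  -- 27000144

-- Same estimate when only the 97 distinct to_route_iid values drive lookups.
SELECT 97 * 1746 AS estimated_visits_grouped;       -- 169362
```

On these estimates, grouping cuts the number of c2 lookups by about two orders of magnitude.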
So this query:
select c1.from_route_iid, c2.from_route_iid, c2.to_route_iid
from (
    select from_route_iid, to_route_iid
    from route_connections
    where from_route_iid in (864, 865, 495, 494, 459, 54, 458)
    group by to_route_iid
) c1, route_connections c2
where c2.from_route_iid = c1.to_route_iid
and c2.to_route_iid in (745, 744, 1096, 1093, 743, 317, 742, 13, 316)
group by c2.from_route_iid, c2.to_route_iid

runs in 145 ms. Nice, this morning I started at 2 minutes :)
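One caveat, offered as a sketch and not as a correction of the measured result: the derived table selects `from_route_iid` but groups only by `to_route_iid`, so MySQL 5.6 returns an arbitrary departure route for each group. MySQL 5.7.5 and later reject such a query under the default `ONLY_FULL_GROUP_BY` mode. Assuming every distinct (departure, connecting, arrival) combination is wanted, a variant using DISTINCT could look like the following. The alias `via_route_iid` is made up for illustration, and its run time has not been measured.

```sql
SELECT DISTINCT
    c1.from_route_iid,
    c2.from_route_iid AS via_route_iid,
    c2.to_route_iid
FROM (
    -- Keep each distinct (departure, connecting) pair exactly once.
    SELECT DISTINCT from_route_iid, to_route_iid
    FROM route_connections
    WHERE from_route_iid IN (864, 865, 495, 494, 459, 54, 458)
) c1
INNER JOIN route_connections c2
    ON c2.from_route_iid = c1.to_route_iid
WHERE c2.to_route_iid IN (745, 744, 1096, 1093, 743, 317, 742, 13, 316);
```

The derived table here can hold more than the 97 rows of the grouped version (at most one row per departure/connecting pair), so it may run slower than 145 ms. In return, every departure route in the result is deterministic.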
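What are the different indexes on this table?

They are shown in the CREATE TABLE statement. Basically every field is indexed individually, there is one index on the from/to route IDs, and I used to have a unique index on the 4 fields (all except the PK), but I removed it to see the effect on performance.

Why use a join when you have a working query without one?

Your question mentions SELECT c1.*, c2.*. Please consider changing it to enumerate the exact columns you need in your result set. Knowing which columns are required is a good start for optimizing the query with a so-called compound covering index.

@OllieJones: right, but it is a bit more complicated. Besides, I will select every field except the PK, so it is almost the same as *. I have updated my question.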
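To make the covering-index suggestion concrete, here is a sketch under assumptions: the index name `from_to_route_services` is hypothetical, and the column order assumes lookups by `from_route_iid` first, as in the queries above. Whether it helps for this workload has not been measured.

```sql
-- Hypothetical compound index holding every non-PK column, so a query that
-- reads only these columns can be answered from the index alone.
ALTER TABLE route_connections
    ADD INDEX from_to_route_services
        (from_route_iid, to_route_iid, from_service_id, to_service_id);

-- If the index covers the query, the Extra column reports "Using index".
EXPLAIN
SELECT from_route_iid, to_route_iid, from_service_id, to_service_id
FROM route_connections
WHERE from_route_iid IN (864, 865, 495, 494, 459, 54, 458);
```

The trade-off is a noticeably larger index, since it includes two `varchar(100)` columns, and slower writes on a 3.7-million-row table.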