MySQL: joining a table to itself is very slow
I have a table that I join to itself. I want to join it twice. Here is the schema:
CREATE TABLE `route_connections` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`from_route_iid` int(11) NOT NULL,
`from_service_id` varchar(100) NOT NULL,
`to_route_iid` int(11) NOT NULL,
`to_service_id` varchar(100) NOT NULL,
PRIMARY KEY (`id`),
KEY `to_route` (`to_route_iid`),
KEY `from_route` (`from_route_iid`),
KEY `to_service` (`to_service_id`),
KEY `from_service` (`from_service_id`),
KEY `from_to_route` (`from_route_iid`,`to_route_iid`)
) ENGINE=InnoDB AUTO_INCREMENT=6798783 DEFAULT CHARSET=utf8
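It has about 3.7 million rows.
My main goal is to find a path that uses 3 routes (2 route connections), given the lists of allowed departure and arrival routes (the connecting route has to be determined by the query).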
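Path: route A → route B → route C:
- departure routes (known list, A)
- c1 (A → B) route_connections
- connecting route (unknown, B)
- c2 (B → C) route_connections
- arrival routes (known list, C)
The route_iids involved are c1.from, c1.to or c2.from (the same value), and c2.to.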
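The path described above maps onto a two-hop self-join. The following is only a sketch of that shape, not the exact query from the question: it assumes the departure and arrival route lists are the ones that appear later in this thread, and the alias via_route_iid is made up for illustration.

```sql
-- Sketch only: two-hop path A -> B -> C over route_connections.
-- The IN lists are assumed to be the departure (A) and arrival (C) routes
-- used later in this thread; via_route_iid is a hypothetical alias for B.
SELECT c1.from_route_iid,                   -- departure route (A)
       c1.to_route_iid AS via_route_iid,    -- connecting route (B)
       c2.to_route_iid                      -- arrival route (C)
FROM route_connections AS c1
INNER JOIN route_connections AS c2
        ON c2.from_route_iid = c1.to_route_iid
WHERE c1.from_route_iid IN (864, 865, 495, 494, 459, 54, 458)
  AND c2.to_route_iid IN (745, 744, 1096, 1093, 743, 317, 742, 13, 316);
```

The service_id filters would be added as further conditions on c1 and c2.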
In addition, I need to filter each service_id with the following filter:
service_id in (
select service_id from (
select service_id from calendar c
where c.start_date <= 20141109 and end_date >= 20141109
union
select service_id from calendar_dates cd
where cd.date = 20141109 and exception_type = 1
) x
where x.service_id not in (
select service_id from calendar_dates cd
where cd.date = 20141109 and exception_type = 2
)
)
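But my goal is to find 2 connections, so I use this query, which takes a lot of time (and returns no results):
It used to take 50 seconds, but I added the from_to_route index, which brought the query time down to 18-20 seconds.
I also tried without a JOIN: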
SELECT ...
FROM route_connections c1, route_connections c2
WHERE ...
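But it gives exactly the same performance (I guess internally it is exactly the same as a join).
I tried changing the INNER JOIN to a LEFT JOIN plus a HAVING clause, but it was worse (as expected).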
I tried removing all indexes except the following two:
- PRIMARY KEY (id)
- KEY from_route_iid (from_route_iid, to_route_iid)
EXPLAIN output:
+----+-------------+-------+-------+------------------------------------+----------------+---------+----------------------------------+-------+----------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+------------------------------------+----------------+---------+----------------------------------+-------+----------------------------------+
| 1 | SIMPLE | c1 | range | to_route,from_route,from_route_iid | from_route | 4 | NULL | 15464 | Using index condition; Using MRR |
| 1 | SIMPLE | c2 | ref | to_route,from_route,from_route_iid | from_route_iid | 4 | bicou_gtfs_paris.c1.to_route_iid | 1746 | Using index condition |
+----+-------------+-------+-------+------------------------------------+----------------+---------+----------------------------------+-------+----------------------------------+
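What is the right way to join a table to itself? Am I missing an index?
Hardware is a 2014 MacBook Air with a 1.7 GHz Core i7, 8 GB RAM and a 256 GB SSD. Software is Mac OS X 10.10 Yosemite with MySQL 5.6.21.

OK, here is how I found the solution: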
select to_route_iid
from route_connections
where from_route_iid in (864, 865, 495, 494, 459, 54, 458)
=> 15471 rows
select to_route_iid
from route_connections
where from_route_iid in (864, 865, 495, 494, 459, 54, 458)
group by to_route_iid
=> 97 rows
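The GROUP BY above is used only to collapse duplicate connecting routes, so SELECT DISTINCT should return the same set of rows. A minimal sketch, reusing the route list from the query above:

```sql
-- Sketch: same set of connecting routes as the GROUP BY version,
-- written with DISTINCT because no aggregate is being computed.
SELECT DISTINCT to_route_iid
FROM route_connections
WHERE from_route_iid IN (864, 865, 495, 494, 459, 54, 458);
```

This distinction matters more once extra columns are selected: selecting from_route_iid while grouping only by to_route_iid relies on MySQL's non-standard GROUP BY handling, returns an arbitrary from_route_iid per group, and is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled.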
The same goes for the arrival routes: 131 rows grouped, 25427 rows ungrouped.
So this query:
select c1.from_route_iid, c2.from_route_iid, c2.to_route_iid
from (
select from_route_iid, to_route_iid
from route_connections
where from_route_iid in (864, 865, 495, 494, 459, 54, 458)
group by to_route_iid
) c1, route_connections c2
where c2.from_route_iid = c1.to_route_iid
and c2.to_route_iid in (745, 744, 1096, 1093, 743, 317, 742, 13, 316)
group by c2.from_route_iid, c2.to_route_iid
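runs in 145 ms. Not bad, given that I started at 2 minutes this morning :)

What are the different indexes on this table?
They are shown in the CREATE TABLE statement. Basically every field is indexed individually, and there is one index on the from/to route IDs. I used to have a unique index on the 4 fields (everything except the PK), but I removed it to see the effect on performance.
Why use a join when you have a working query without one?
Your question mentions selecting c1.*, c2.*. Consider changing that to enumerate the exact columns you need in your result set. Knowing which columns are required is a good start toward optimizing the query with a so-called compound covering index.
@OllieJones: Right, but it is slightly more complicated. Besides, I will be selecting every field except the PK, so it is almost the same as *. I have updated my question.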
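The compound covering index suggested in the comments could look something like the sketch below. The index name from_to_route_service is hypothetical, and the column order assumes the join looks rows up by from_route_iid first and then filters on to_route_iid, as the queries in this thread do.

```sql
-- Sketch only: a covering index holding every non-PK column, so a lookup
-- on c2 can be answered from the index without reading the table rows.
-- The index name from_to_route_service is made up for illustration.
ALTER TABLE route_connections
  ADD INDEX from_to_route_service
    (from_route_iid, to_route_iid, from_service_id, to_service_id);

-- Check whether the optimizer actually uses it for the self-join.
EXPLAIN
SELECT c1.from_route_iid, c1.from_service_id,
       c2.from_route_iid, c2.from_service_id,
       c2.to_route_iid,   c2.to_service_id
FROM route_connections AS c1
INNER JOIN route_connections AS c2
        ON c2.from_route_iid = c1.to_route_iid
WHERE c1.from_route_iid IN (864, 865, 495, 494, 459, 54, 458)
  AND c2.to_route_iid IN (745, 744, 1096, 1093, 743, 317, 742, 13, 316);
```

If the optimizer picks the index, the Extra column for c2 would typically include Using index. Whether that is worth the additional index size on 3.7 million rows would need to be measured.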