Mysql SQL性能问题:查找路由
我正在努力解决SQL查询的性能问题 我坐火车旅行了5个站,名字是a-B-C-D-E。 一位乘客预订了一张B-C-D车票。 我需要找回我的乘客去的所有车站 我所储存的:Mysql SQL性能问题:查找路由,mysql,sql,mariadb,Mysql,Sql,Mariadb,我正在努力解决SQL查询的性能问题 我坐火车旅行了5个站,名字是a-B-C-D-E。 一位乘客预订了一张B-C-D车票。 我需要找回我的乘客去的所有车站 我所储存的: JOURNEY +----+--------------------+-------------------+-------------------+-----------------+ | id | departure_datetime | arrival_datetime | departure_station | arri
JOURNEY
+----+--------------------+-------------------+-------------------+-----------------+
| id | departure_datetime | arrival_datetime | departure_station | arrival_station |
+----+--------------------+-------------------+-------------------+-----------------+
| 1 | 2018-01-01 06:00 | 2018-01-01 10:00 | A | E |
+----+--------------------+-------------------+-------------------+-----------------+
BOOKING
+----+------------+-------------------+-----------------+
| id | journey_id | departure_station | arrival_station |
+----+------------+-------------------+-----------------+
| 1 | 1 | B | D |
+----+------------+-------------------+-----------------+
LEG
+----+------------+-------------------+-----------------+------------------+------------------+
| id | journey_id | departure_station | arrival_station | departure_time | arrival_time |
+----+------------+-------------------+-----------------+------------------+------------------+
| 1 | 1 | A | B | 2018-01-01 06:00 | 2018-01-01 07:00 |
| 2 | 1 | B | C | 2018-01-01 07:00 | 2018-01-01 08:00 |
| 3 | 1 | C | D | 2018-01-01 08:00 | 2018-01-01 09:00 |
| 4 | 1 | D | E | 2018-01-01 09:00 | 2018-01-01 10:00 |
+----+------------+-------------------+-----------------+------------------+------------------+
我发现检索站点的唯一方法是:
select b.id as booking, l.departure_station, l.arrival_station
from JOURNEY j
inner join BOOKING b on j.id = b.journey_id
inner join LEG dl on (j.id = dl.journey_id and b.departure_station = dl.departure_station)
inner join LEG al on (j.id = al.journey_id and b.arrival_station = al.arrival_station)
inner join LEG l on (j.id = l.journey_id and l.departure_time >= dl.departure_time and l.arrival_time <= al.arrival_time)
where b.id = 1
我在mariadb 12.2上工作,所以我可以访问窗口功能,但我仍然对它不太满意
谢谢
编辑:创建表:
CREATE TABLE `BOOKING` (
`id` INT(11) NOT NULL,
`journey_id` INT(11) NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `JOURNEY` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`departure_time` DATETIME NULL DEFAULT NULL,
`arrival_time` DATETIME NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `LEG` (
`id` INT(11) NOT NULL,
`journey_id` INT(11) NULL DEFAULT NULL,
`departure_station` VARCHAR(50) NULL DEFAULT NULL,
`arrival_station` VARCHAR(50) NULL DEFAULT NULL,
`departure_time` DATETIME NULL DEFAULT NULL,
`arrival_time` DATETIME NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
尝试使用left join并使用REGEXP来登记出发站和到达站
select T3.id booking_id , T1.departure_station,T1.arrival_station
from LEG T1
left join JOURNEY T2 on T1.`journey_id` = T2.`id`
and (T1.`departure_time` >= T2.`departure_datetime` and T1.`arrival_time` <= T2.`arrival_datetime`)
left join BOOKING T3 on T3.`id` = T2.`id`
and T1.departure_station REGEXP (CONCAT('[',T3.departure_station , '-' , T3.arrival_station,']' ))
and T1.arrival_station REGEXP (CONCAT('[',T3.departure_station , '-' , T3.arrival_station,']' ))
where T1.journey_id = 1 and T3.id is not null ;
测试DDL:
我不喜欢你的数据库模式。 但在您的特殊情况下,因为您的查询对您有好处。 我只需要创建几个索引来加快执行速度。 一般来说,当您需要将表连接到自身几次时,并没有什么问题 尝试只添加4个索引:
CREATE INDEX journey ON BOOKING (journey_id);
CREATE INDEX arrival ON LEG (journey_id, arrival_station);
CREATE INDEX departure ON LEG (journey_id, departure_station);
CREATE INDEX d_a_time ON LEG (journey_id, departure_time, arrival_time);
再次运行查询,使用索引时应该会快得多。我建议使用:
您给出的示例的预期结果是什么?我在问题中添加了,很抱歉遗漏了…请为每个表提供您的创建表请回答问题请检查我的答案。创建您需要的索引。为什么您认为这样更快?@Alex Asker加入三次Leg抱歉,我没有指定站点名称,但站点名称不是按字母顺序排列的,可以是任何名称。另一方面,出发时间总是与上次到达时间相同。你的回答给了我一些想法,谢谢。这是我数据库模式的一个非常轻量级的模型。实际上,LEG、travel和BOOKING中的站点只是指向站点表的外键id。谢谢你的回答,我会尝试使用那些特定的ID。我的答案不是关于ID,而是关于索引。谢谢你那些在出发和到达时丢失的索引,现在好多了。酷,我喜欢这个建议。我明天肯定会尝试,谢谢。注意:需要MySQL 8或更高版本我们在查询结束时还需要b.id=1吗?我觉得我们在CTE时就已经做到了。最后看看这种方法,我真的怀疑它是否会带来任何性能改进。对我来说,它使事情变得更慢,因为您使用类似于leg_cte的视图或临时表,而不是基本的核心表leg,这对于Mysql或MariaDB 10.2来说要好得多。
| booking_id | departure_station | arrival_station |
|------------|-------------------|-----------------|
| 1 | B | C |
| 1 | C | D |
CREATE TABLE JOURNEY
(`id` int, `departure_datetime` datetime, `arrival_datetime` datetime, `departure_station` varchar(1), `arrival_station` varchar(1))
;
INSERT INTO JOURNEY
(`id`, `departure_datetime`, `arrival_datetime`, `departure_station`, `arrival_station`)
VALUES
(1, '2018-01-01 06:00:00', '2018-01-01 10:00:00', 'A', 'E')
;
CREATE TABLE BOOKING
(`id` int, `journey_id` int, `departure_station` varchar(1), `arrival_station` varchar(1))
;
INSERT INTO BOOKING
(`id`, `journey_id`, `departure_station`, `arrival_station`)
VALUES
(1, 1, 'B', 'D')
;
CREATE TABLE LEG
(`id` int, `journey_id` int, `departure_station` varchar(1), `arrival_station` varchar(1), `departure_time` datetime, `arrival_time` datetime)
;
INSERT INTO LEG
(`id`, `journey_id`, `departure_station`, `arrival_station`, `departure_time`, `arrival_time`)
VALUES
(1, 1, 'A', 'B', '2018-01-01 06:00:00', '2018-01-01 07:00:00'),
(2, 1, 'B', 'C', '2018-01-01 07:00:00', '2018-01-01 08:00:00'),
(3, 1, 'C', 'D', '2018-01-01 08:00:00', '2018-01-01 09:00:00'),
(4, 1, 'D', 'E', '2018-01-01 09:00:00', '2018-01-01 10:00:00')
;
CREATE INDEX journey ON BOOKING (journey_id);
CREATE INDEX arrival ON LEG (journey_id, arrival_station);
CREATE INDEX departure ON LEG (journey_id, departure_station);
CREATE INDEX d_a_time ON LEG (journey_id, departure_time, arrival_time);
WITH leg_cte as
(
SELECT l.* FROM leg l
JOIN booking b
ON l.journey_id = b.journey_id
WHERE b.id = 1
)
SELECT
b.id as booking,
l.departure_station,
l.arrival_station
FROM
booking b
JOIN leg_cte dl
ON b.departure_station = dl.departure_station
JOIN leg_cte al
ON b.arrival_station = al.arrival_station
JOIN leg_cte l
ON l.departure_time >= dl.departure_time AND l.arrival_time <= al.arrival_time
WHERE b.id = 1