MySQL: subquery optimization problem where the subquery examines more than 14,000 rows
I need help optimizing the subquery below. In short, I have the following query, in which the tree table is joined to the branch table on s_id and on the maximum timestamp of the branch table, as determined by the subquery condition.
I am happy with the results this query returns. However, the query is very slow. The bottleneck is the dependent subquery (branch2), which examines more than 14,000 rows. How can I optimize the subquery to speed up this query?
SELECT *
FROM dept.tree tree
LEFT JOIN dept.branch branch ON tree.s_id = branch.s_id
AND branch.timestamp =
(
SELECT MAX(timestamp)
FROM dept.branch branch2
WHERE branch2.s_id = tree.s_id
AND branch2.timestamp <= tree.timestamp
)
WHERE tree.timestamp BETWEEN CONVERT_TZ('2020-05-16 00:00:00', 'America/Toronto', 'UTC')
AND CONVERT_TZ('2020-05-16 23:59:59', 'America/Toronto', 'UTC')
AND tree.s_id IN ('459','460')
ORDER BY tree.timestamp ASC;
Table branch:
id box_id timestamp data
373001345 1 2020-05-07 06:00:20 {"R": 0.114, "H": 20.808}
373001395 1 2020-05-07 06:02:26 {"R": 0.12, "H": 15.544}
373001462 1 2020-05-07 06:03:01 {"R": 0.006, "H": 55.469}
373001494 1 2020-05-07 06:04:38 {"R": 0.004, "H": 51.85}
373001496 1 2020-05-07 06:05:18 {"R": 0.02, "H": 5.8965}
373001497 1 2020-05-07 06:06:39 {"R": 0.12, "H": 54.32}
373001510 2 2020-05-07 06:07:09 {"R": 0.34, "H": 1.32}
373001511 2 2020-05-07 06:07:29 {"R": 0.56, "H": 32.7}
The branch table has indexes on s_id and timestamp.
I am using version 5.7.25-google-log.
The EXPLAIN output is as follows:
id select_type table partitions type possible_keys key key_len ref rows filtered Extra
1 PRIMARY tree range unique_timestamp_s_id,idx_s_id_timestamp,idx_timestamp idx_s_id_timestamp 10 2629 100.00 Using index condition; Using filesort
1 PRIMARY branch ref unique_timestamp_s_id,idx_timestamp unique_timestamp_s_id 5 func 1 100.00 Using where
2 DEPENDENT SUBQUERY branch2 ref unique_timestamp_s_id,idx_s_id_timestamp,idx_timestamp idx_s_id_timestamp 5 tree.s_id 14122 33.33 Using where; Using index
Please provide SHOW CREATE TABLE.
branch needs INDEX(s_id, timestamp).
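The two requests above can be sketched as the statements below. This is only an illustration: the index name idx_branch_s_id_timestamp is hypothetical, and the EXPLAIN output in the question already lists an index called idx_s_id_timestamp for branch2, so an equivalent composite index may already exist and the ALTER TABLE may be unnecessary.

```sql
-- Show the full definitions, including existing indexes, of both tables.
SHOW CREATE TABLE dept.tree;
SHOW CREATE TABLE dept.branch;

-- Add a composite index with s_id first and timestamp second.
-- Hypothetical index name; skip this step if SHOW CREATE TABLE already
-- reports an index on (s_id, timestamp).
ALTER TABLE dept.branch
  ADD INDEX idx_branch_s_id_timestamp (s_id, `timestamp`);
```

With s_id as the leading column, a lookup for one s_id and a range of timestamps can be resolved from the index alone.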
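Do you need the LEFT? It may be slowing the query down for no reason.
The combination of an IN on one column and a BETWEEN on another column may not be optimized well; which version are you running?
Please provide EXPLAIN SELECT so we can discuss whether it is optimized well. If it is not, we can discuss how to turn the IN into a UNION variant.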
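For reference, a minimal sketch of what the UNION variant of the original query might look like, assuming one branch of the UNION per s_id value. The explicit column list and the alias tree_timestamp are assumptions made here so that the final ORDER BY is unambiguous; the original query uses SELECT *. Whether this form is actually faster would need to be checked with EXPLAIN, since each branch still contains the dependent subquery.

```sql
-- Sketch only: each SELECT has an equality on s_id plus a range on timestamp,
-- instead of IN on one column combined with BETWEEN on another.
(
  SELECT tree.s_id, tree.timestamp AS tree_timestamp, branch.data
  FROM dept.tree tree
  LEFT JOIN dept.branch branch
         ON branch.s_id = tree.s_id
        AND branch.timestamp = (
              SELECT MAX(branch2.timestamp)
              FROM dept.branch branch2
              WHERE branch2.s_id = tree.s_id
                AND branch2.timestamp <= tree.timestamp
            )
  WHERE tree.s_id = '459'
    AND tree.timestamp BETWEEN CONVERT_TZ('2020-05-16 00:00:00', 'America/Toronto', 'UTC')
                           AND CONVERT_TZ('2020-05-16 23:59:59', 'America/Toronto', 'UTC')
)
UNION ALL
(
  SELECT tree.s_id, tree.timestamp AS tree_timestamp, branch.data
  FROM dept.tree tree
  LEFT JOIN dept.branch branch
         ON branch.s_id = tree.s_id
        AND branch.timestamp = (
              SELECT MAX(branch2.timestamp)
              FROM dept.branch branch2
              WHERE branch2.s_id = tree.s_id
                AND branch2.timestamp <= tree.timestamp
            )
  WHERE tree.s_id = '460'
    AND tree.timestamp BETWEEN CONVERT_TZ('2020-05-16 00:00:00', 'America/Toronto', 'UTC')
                           AND CONVERT_TZ('2020-05-16 23:59:59', 'America/Toronto', 'UTC')
)
-- Applies to the combined result of both branches.
ORDER BY tree_timestamp ASC;
```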
The following may actually be faster than the approach I had in mind above.
With the index above in place, rewrite the query as:
SELECT b.*
FROM ( SELECT s_id,
MAX(timestamp) as timestamp
FROM dept.branch
WHERE timestamp BETWEEN
CONVERT_TZ('2020-05-16 00:00:00', 'America/Toronto', 'UTC')
AND CONVERT_TZ('2020-05-16 23:59:59', 'America/Toronto', 'UTC')
AND s_id IN ('459','460')
) AS x
JOIN dept.branch AS b USING(s_id, timestamp)
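First, see whether this returns the right information. Then, if you need help, I will explain how to do the UNION in the subquery.
This will be faster: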
select
tree.s_id, tree.timestamp, branch.data
from
(
SELECT
tree.s_id, tree.timestamp, max(branch.timestamp) as max_branch_timestamp
FROM
dept.tree tree
LEFT JOIN dept.branch branch
ON(
branch.s_id = tree.s_id
and branch.timestamp <= tree.timestamp
)
WHERE
tree.timestamp BETWEEN
CONVERT_TZ('2020-05-16 00:00:00', 'America/Toronto', 'UTC') AND
CONVERT_TZ('2020-05-16 23:59:59', 'America/Toronto', 'UTC')
AND tree.s_id IN ('459','460')
group by tree.s_id, tree.timestamp
) tree
left outer join branch
on(
branch.s_id = tree.s_id
and branch.timestamp = tree.max_branch_timestamp
)
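Sample data, desired results, and an explanation of the logic you want to implement would all help. Alternatively, please wait a few minutes.
What is the exact version of MySQL? Is this groupwise max?
I have added more information above. I apologize for not adding it earlier. Please let me know if I need to provide more details.
Thanks, I have provided more information; let me know if it helps. I am sorry for not providing it sooner.
I have a question regarding your query: are you joining dept.tree rather than dept.branch?
The query above gives the following error: Error Code: 1140. In aggregated query without GROUP BY, expression #1 of SELECT list contains nonaggregated column 'dept.s_id'; this is incompatible with sql_mode=only_full_group_by
Oops. I need to take another look.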
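Error 1140 is raised because the derived table selects s_id next to MAX(timestamp) without a GROUP BY clause, which ONLY_FULL_GROUP_BY rejects. A minimal sketch of one possible correction, assuming the intent was the latest branch timestamp per s_id within the time window, adds GROUP BY s_id. Note that this returns one branch row per s_id rather than one per tree row, so it may not reproduce the results of the original query.

```sql
SELECT b.*
FROM (
    -- Latest branch timestamp for each s_id inside the time window.
    SELECT s_id,
           MAX(timestamp) AS timestamp
    FROM dept.branch
    WHERE timestamp BETWEEN
              CONVERT_TZ('2020-05-16 00:00:00', 'America/Toronto', 'UTC')
          AND CONVERT_TZ('2020-05-16 23:59:59', 'America/Toronto', 'UTC')
      AND s_id IN ('459','460')
    GROUP BY s_id   -- added here; absent from the query that raised error 1140
) AS x
-- Join back to fetch the full row that carries each maximum timestamp.
JOIN dept.branch AS b USING (s_id, timestamp);
```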