PostgreSQL: Postgres query with a date filter takes too long
I am using Postgres and I have a query that takes too long to run. The query is:
select mytable.id as id1_432_
from mytable mytable
cross join mytable2 mytable2
where mytable.moLine_id=mytable2.id
and mytable2.realline_id = 570
and mytable.type='DONE'
and mytable.effectiveAt<='2020-01-06'
and (mytable.effectiveAt is null or mytable.effectiveAt>='2020-01-04')
and mytable.isCancelled=false
order by mytable.createdAt asc
limit 10;
Here is the explain plan:
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.71..592.03 rows=10 width=16) (actual time=2039.456..2039.458 rows=0 loops=1)
Buffers: shared hit=1948148 read=85274
-> Nested Loop (cost=0.71..8117326.12 rows=137276 width=16) (actual time=2039.454..2039.456 rows=0 loops=1)
Buffers: shared hit=1948148 read=85274
-> Index Scan using test_index_2 on mytable mytable1 (cost=0.43..4580629.13 rows=544778 width=24) (actual time=1962.026..2036.599 rows=2044 loops=1)
Index Cond: (((type)::text = 'DONE'::text) AND (iscancelled = false) AND (effectiveat <= '2020-01-06 00:00:00'::timestamp without time zone))
Filter: ((effectiveat IS NULL) OR (effectiveat >= '2020-01-04 00:00:00'::timestamp without time zone))
Rows Removed by Filter: 2963978
Buffers: shared hit=1942021 read=85269
-> Index Scan using sys_c001197805 on mytable2 mytable2 (cost=0.28..6.49 rows=1 width=8) (actual time=0.001..0.001 rows=0 loops=2044)
Index Cond: (id = mytable1.moline_id)
Filter: (realline_id = 570)
Rows Removed by Filter: 1
Buffers: shared hit=6127 read=5
Planning Time: 0.548 ms
Execution Time: 2039.513 ms
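How can I optimize this query by adding a good index?
The planner thinks it will find 137276 rows, but it actually finds 0. So it prefers to read the index in createdAt order: that way it avoids sorting 137276 rows and expects to stop early (after what it thinks will be 10/137276 of the scan). That estimate is badly wrong, because it has to scan the entire portion of the index that satisfies type='DONE' and iscancelled=false, which is most of the index. And the sort it avoids is really a sort of 0 rows, which would not take enough time to be worth avoiding.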
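A better index would swap the order of the createdAt and effectiveAt columns. That way the effectiveAt condition can limit the part of the index that needs to be scanned. To get the full benefit, though, you have to remove the useless "... IS NULL OR ..." test, because PostgreSQL is not smart enough to discard it. Also, if both indexes exist, the wrong row count estimate may still lead it to choose the wrong one.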
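The suggestion above could be sketched roughly as follows. This assumes test_index_2 is defined on (type, iscancelled, createdat, effectiveat), which the question does not confirm; the index name is made up for illustration. Dropping the IS NULL branch does not change the result, because effectiveAt <= '2020-01-06' already rejects NULL values.

```sql
-- Hypothetical index: same columns as the assumed test_index_2,
-- but with effectiveat placed before createdat.
CREATE INDEX mytable_type_cancelled_effective_created_idx
    ON mytable (type, iscancelled, effectiveat, createdat);

-- Same query without the redundant "effectiveAt is null or" branch,
-- so both date bounds can be used as index conditions.
SELECT mytable.id
FROM mytable
JOIN mytable2 ON mytable.moline_id = mytable2.id
WHERE mytable2.realline_id = 570
  AND mytable.type = 'DONE'
  AND mytable.effectiveat >= '2020-01-04'
  AND mytable.effectiveat <= '2020-01-06'
  AND mytable.iscancelled = false
ORDER BY mytable.createdat ASC
LIMIT 10;
```

With this column order the rows no longer come out of the index sorted by createdAt, so a sort step is expected in the plan; it should be cheap if only a few rows fall inside the date range.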
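Another improvement would be to add moline_id to the end of the index. That way you can get an index-only scan, which avoids a lot of random IO spent looking up table rows in (probably) random order. This helps regardless of whether you swap the order of the other columns, and if the table is well vacuumed, this alone might be enough to make the query fast enough.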
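A minimal sketch of such an index, under the same unconfirmed assumption about which columns the existing index holds. The index name is hypothetical. The id column is appended as well because the query selects it, and an index-only scan needs every referenced column to be present in the index; leave it out if your index already contains it.

```sql
-- Hypothetical covering index: moline_id (and id) appended at the end
-- so the scan on mytable can be answered from the index alone.
CREATE INDEX mytable_covering_idx
    ON mytable (type, iscancelled, effectiveat, createdat, moline_id, id);

-- Index-only scans can skip heap fetches only for pages marked
-- all-visible, so vacuum the table and refresh statistics afterwards.
VACUUM (ANALYZE) mytable;
```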
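Please add the EXPLAIN (ANALYZE, BUFFERS) output.
The explain plan contains a LIMIT that is not in the query.
The "effectiveAt is null or" in (mytable.effectiveAt is null or mytable.effectiveAt>='2020-01-04') is redundant.
@LaurenzAlbe I will update the explain output to use analyze, buffers.
@Jeremy I did not add the LIMIT, but it is there; I have updated my query.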
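For reference, the output requested in the comments can be produced by prefixing the query from the question as shown here. Note that ANALYZE actually executes the statement, so the reported timings and buffer counts come from a real run.

```sql
-- Runs the query and reports actual row counts, timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
select mytable.id as id1_432_
from mytable mytable
cross join mytable2 mytable2
where mytable.moLine_id=mytable2.id
and mytable2.realline_id = 570
and mytable.type='DONE'
and mytable.effectiveAt<='2020-01-06'
and (mytable.effectiveAt is null or mytable.effectiveAt>='2020-01-04')
and mytable.isCancelled=false
order by mytable.createdAt asc
limit 10;
```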
select count(*) from mytable ;
count
---------
3652331
select count(*) from mytable2;
count
-------
5417