PostgreSQL: why does an SQL query with two conditions take so long to run?
I have a table with 12 million records. There are indexes on columns col_1 and col_2. I am using PostgreSQL 9.3. I need two types of queries. First, some queries have only one condition in the where clause, for example:
select count(*)
from table_1
where
col_1 >= 123456;
EXPLAIN ANALYZE output:
Aggregate (cost=164523.60..164523.61 rows=1 width=0) (actual time=1803.281..1803.281 rows=1 loops=1)
-> Index Only Scan using table1_col1_idx on table_1 (cost=0.43..151242.20 rows=5312558 width=0) (actual time=60.713..1344.393 rows=5318333 loops=1)
Index Cond: (col_1 >= 123456)
Heap Fetches: 0
Total runtime: 1803.330 ms
And another query, such as:
select count(*)
from table_1
where
col_2 >= 987654;
EXPLAIN ANALYZE output:
Aggregate (cost=364134.66..364134.67 rows=1 width=0) (actual time=3935.708..3935.708 rows=1 loops=1)
-> Index Only Scan using table1_col2_idx on table_1 (cost=0.43..334739.38 rows=11758111 width=0) (actual time=7.521..2904.569 rows=11760285 loops=1)
Index Cond: (col_2 >= 987654)
Heap Fetches: 0
Total runtime: 3935.760 ms
However, the problem is the long run time of combined where clauses, that is, when two or more conditions are combined with AND/OR. For example:
select count(*)
from table_1
where
col_1 >= 123456 AND col_2 >= 987654;
EXPLAIN ANALYZE output:
-> Seq Scan on table_1 (cost=0.00..650822.93 rows=5295377 width=0) (actual time=0.056..45445.711 rows=5301622 loops=1)
Filter: ((col_2 >= 987654) AND (col_1 >= 123456))
Rows Removed by Filter: 6494640
Total runtime: 45961.622 ms
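This is unacceptable: 3 seconds versus 45 seconds! So, is there any way to speed up this kind of combined query? How can I modify the query to force the planner to use the indexes on col_1 and col_2?
I also tried:
SET enable_seqscan = false;
The planner then switched its plan to a bitmap scan, which resulted in a run time of 137 seconds: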
Aggregate (cost=666246.28..666246.29 rows=1 width=0) (actual time=137311.964..137311.964 rows=1 loops=1)
-> Bitmap Heap Scan on table_1 (cost=99440.46..653007.83 rows=5295377 width=0) (actual time=1105.153..136527.723 rows=5301622 loops=1)
Recheck Cond: (col_1 >= 123456)
Filter: (col_2 >= 987654)
Rows Removed by Filter: 16711
-> Bitmap Index Scan on table1_col1_idx (cost=0.00..98116.62 rows=5312558 width=0) (actual time=862.677..862.677 rows=5318333 loops=1)
Index Cond: (col_1 >= 123456)
Total runtime: 137314.450 ms
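The experiment above can be repeated without leaving the setting changed for the rest of the session. The following is only a sketch, assuming the table and column names from the question; `SET LOCAL` limits the override to the enclosing transaction, and the `BUFFERS` option shows how many pages came from cache versus disk, which may help explain the large gap between the plans.

```sql
-- Sketch only: scope the planner override to a single transaction.
BEGIN;

-- Discourage sequential scans for this transaction only;
-- the setting reverts automatically at COMMIT or ROLLBACK.
SET LOCAL enable_seqscan = off;

-- BUFFERS adds shared-buffer hit/read counts to each plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM table_1
WHERE col_1 >= 123456
  AND col_2 >= 987654;

ROLLBACK;
```

Note that `enable_seqscan` is a diagnostic switch rather than a tuning fix: it only makes sequential scans look expensive to the planner, and as the 137-second result shows, the alternative plan can be slower.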
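Assuming there are indexes on col_1 and col_2, the planner can probably use an index-only scan for each single-condition query, but for the combined query it has to visit the table to check the other column, or, if the matching range is too large (which is likely), it simply does a full table scan.

See the advice on asking better questions, then edit your question accordingly.

@Bohemian It may be a bitmap index scan, but we shouldn't guess: the question should include the explain analyze output and the PostgreSQL version. Still, +1 for actually including the queries...

Have you tried a composite index on (col_1, col_2)?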
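The composite-index suggestion could be tried roughly as follows. This is a minimal sketch, not a guaranteed fix: the index name `table1_col1_col2_idx` is hypothetical, and the table and column names are taken from the question. With both columns in one index, the planner may be able to answer the combined count with an index-only scan, but since close to half of the table matches, whether it chooses that plan and how much it helps has to be checked with `EXPLAIN ANALYZE`.

```sql
-- Hypothetical index name; table and columns are from the question.
CREATE INDEX table1_col1_col2_idx ON table_1 (col_1, col_2);

-- Refresh planner statistics and the visibility map, so that an
-- index-only scan can avoid heap fetches where pages are all-visible.
VACUUM ANALYZE table_1;

-- Re-check the plan for the combined query.
EXPLAIN ANALYZE
SELECT count(*)
FROM table_1
WHERE col_1 >= 123456
  AND col_2 >= 987654;
```

Column order matters here: only the leading column (`col_1`) narrows the portion of the index that is scanned, while the condition on `col_2` is checked against the index entries within that range.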