PostgreSQL: execution plans with wrong costs when indexes are involved

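Can anyone help me understand why PostgreSQL gets several cost estimates wrong?

I am running experiments with the 22 queries from the TPC-H benchmark [1] to check how indexes affect query performance.

For only 5 of the 22 queries did I verify that the optimizer used a secondary index. In a second experiment, those 5 queries were run against a database with no indexes, to determine whether execution time would increase without them.

To my surprise, the runs against the database without indexes were faster than the runs with indexes (for the 22 queries).

I would like to understand this, because the total cost value looks wrong in every case: the queries that took the longest all report the lower cost. This holds in all 5 cases, which I believe is wrong.

See below: the first row is query 6 using the index, with a cost of 3335809, which is lower than the cost of 5255959 for the same query 6 without the index.

Look at the time spent as well. The query using the index took 7 minutes 56 seconds, while the query without the index took 55 seconds. The same behavior shows up in the other cases.

My question is: why is the total cost in the execution plan estimated incorrectly in every case where an index is used?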

Index | Query | Time spent | Total cost
============================================
Yes   |  6    | 00:07:56   |  3335809.61
No    |  6    | 00:00:55   |  5255959.00
Yes   |  7    | 00:09:16   |  5847359.97
No    |  7    | 00:02:08   |  6793148.45
Yes   | 10    | 00:07:04   | 40743017.17
No    | 10    | 00:02:14   | 41341406.62
Yes   | 15    | 00:10:03   |  6431359.90
No    | 15    | 00:01:56   |  9608659.87
Yes   | 20    | 00:12:48   |  8412159.69
No    | 20    | 00:05:49   |  9537835.93
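
To make the mismatch concrete, the figures in the table can be compared as ratios. The following is a minimal sketch: the values are copied from the table, the times are converted to seconds by hand, and the CTE name `runs` and its column names are made up for illustration.

```sql
-- Figures copied from the table; elapsed times converted to seconds.
WITH runs(query_no, with_index, seconds, total_cost) AS (
    VALUES
        (6,  true,  476, 3335809.61),
        (6,  false,  55, 5255959.00),
        (7,  true,  556, 5847359.97),
        (7,  false, 128, 6793148.45),
        (10, true,  424, 40743017.17),
        (10, false, 134, 41341406.62),
        (15, true,  603, 6431359.90),
        (15, false, 116, 9608659.87),
        (20, true,  768, 8412159.69),
        (20, false, 349, 9537835.93)
)
SELECT i.query_no,
       -- below 1 means the planner rated the indexed plan as cheaper
       round(i.total_cost / n.total_cost, 2)      AS cost_ratio,
       -- above 1 means the indexed plan actually ran slower
       round(i.seconds::numeric / n.seconds, 2)   AS time_ratio
FROM runs AS i
JOIN runs AS n ON n.query_no = i.query_no
WHERE i.with_index AND NOT n.with_index
ORDER BY i.query_no;
```

A row whose cost_ratio is below 1 while its time_ratio is above 1 is a case where the estimate and the measured time point in opposite directions.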

Below are query 6 and its execution plans (with and without the index):

select
    sum(l_extendedprice * l_discount) as revenue
from
    lineitem
where
    l_shipdate >= date '1995-01-01'
    and l_shipdate < date '1995-01-01' + interval '1' year
    and l_discount between 0.09 - 0.01 and 0.09 + 0.01
    and l_quantity < 24;


-=--========= With INDEX (idx_l_shipdatelineitem) -=-=-======
Execution plan:
Aggregate  (cost=3335809.59..3335809.61 rows=1 width=16) (actual time=476033.847..476033.847 rows=1 loops=1)
  ->  Bitmap Heap Scan on lineitem  (cost=376416.20..3330284.29 rows=2210122 width=16) (actual time=375293.183..471695.391 rows=2282333 loops=1)
        Recheck Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate < '1996-01-01 00:00:00'::timestamp without time zone))
        Filter: ((l_discount >= 0.08) AND (l_discount <= 0.10) AND (l_quantity < 24::numeric))
        ->  Bitmap Index Scan on idx_l_shipdatelineitem000  (cost=0.00..375863.67 rows=17925026 width=0) (actual time=375289.456..375289.456 rows=18211743 loops=1)
              Index Cond: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate < '1996-01-01 00:00:00'::timestamp without time zone))
Total runtime: 476034.306 ms

[1] www.tpc.org/tpch/

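Why do you think the cost is "incorrect"?

I assume the OP is asking why the plan with the index is costed lower than the plan without it (roughly half), yet takes about 8 times longer. Right?

Please update the OP with: the time of the last vacuum (or autovacuum), the Postgres version, the amount of RAM, and effective_cache_size, random_page_cost and all non-default settings.

Dear Vao Tsun, I have updated the main question with these parameters, please see the question. And yes, a_horse_with_no_name, that is the question: why is the plan with the index costed lower than the plan without it, when it takes about 8 times longer? Any suggestions?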

-=--========= Without INDEX -=-=-======
Execution plan:
Aggregate  (cost=5255958.99..5255959.00 rows=1 width=16) (actual time=55051.051..55051.051 rows=1 loops=1)
  ->  Seq Scan on lineitem  (cost=0.00..5250433.68 rows=2210122 width=16) (actual time=0.394..52236.276 rows=2282333 loops=1)
        Filter: ((l_shipdate >= '1995-01-01'::date) AND (l_shipdate < '1996-01-01 00:00:00'::timestamp without time zone) AND (l_discount >= 0.08) AND (l_discount <= 0.10) AND (l_quantity < 24::numeric))
Total runtime: 55051.380 ms
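
In the indexed plan above, the Bitmap Index Scan node alone reports about 375 seconds of actual time out of the 476-second total, so most of the time goes into reading the index rather than the heap. One way to look closer is to compare both plans in a single session with buffer statistics. This is only a sketch, and it assumes PostgreSQL 9.0 or later, where EXPLAIN accepts the BUFFERS option; the question does not state the version.

```sql
-- Plan the planner chooses on its own (a bitmap scan in the question's setup).
-- Note: EXPLAIN ANALYZE really executes the query.
EXPLAIN (ANALYZE, BUFFERS)
select sum(l_extendedprice * l_discount) as revenue
from lineitem
where l_shipdate >= date '1995-01-01'
  and l_shipdate < date '1995-01-01' + interval '1' year
  and l_discount between 0.09 - 0.01 and 0.09 + 0.01
  and l_quantity < 24;

-- Same query with index-based plans discouraged for this transaction only,
-- so the index does not have to be dropped to see the sequential scan.
BEGIN;
SET LOCAL enable_bitmapscan = off;
SET LOCAL enable_indexscan = off;
EXPLAIN (ANALYZE, BUFFERS)
select sum(l_extendedprice * l_discount) as revenue
from lineitem
where l_shipdate >= date '1995-01-01'
  and l_shipdate < date '1995-01-01' + interval '1' year
  and l_discount between 0.09 - 0.01 and 0.09 + 0.01
  and l_quantity < 24;
ROLLBACK;
```

The "Buffers:" lines in each output show how many pages were found in shared buffers versus read from outside them, which helps tell whether the slow plan is dominated by physical I/O.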

Settings from postgresql.conf (lines starting with # are commented out, so the default applies):

max_connections = 100
effective_io_concurrency = 5
#seq_page_cost = 1.0
random_page_cost = 1.0
#cpu_tuple_cost = 0.01
#cpu_index_tuple_cost = 0.005
#cpu_operator_cost = 0.0025
#effective_cache_size = 128MB
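
The settings above put random_page_cost at 1.0, the same value as the default seq_page_cost, which tells the planner that a random page fetch costs no more than a sequential one; the stock default for random_page_cost is 4.0. Whether this setting explains the plans in the question is an assumption, not something the thread confirms. A hedged way to test it is to restore the default for one transaction and see whether the estimated cost and the chosen plan change:

```sql
-- Current values in this session.
SHOW random_page_cost;
SHOW effective_cache_size;

BEGIN;
-- Restore PostgreSQL's default for this transaction only.
SET LOCAL random_page_cost = 4.0;

-- Plain EXPLAIN only plans the query; it does not execute it.
EXPLAIN
select sum(l_extendedprice * l_discount) as revenue
from lineitem
where l_shipdate >= date '1995-01-01'
  and l_shipdate < date '1995-01-01' + interval '1' year
  and l_discount between 0.09 - 0.01 and 0.09 + 0.01
  and l_quantity < 24;
ROLLBACK;
```

If the planner switches to a Seq Scan under the default value, the custom random_page_cost is a likely contributor to the underestimated cost of the index plan; if the plan stays the same, the cause lies elsewhere.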