PostgreSQL: poor performance of count(*) SELECT with additional LEFT JOINs

I have the following SELECT:

SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
  LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
  LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
  LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1 AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0'

It takes almost 18 seconds to run.

The SUBJECTS table has 376k rows, FIAS_HOUSE has 21 million rows, and FIAS_AGREGATE_ADDRESS has about 130. EXPLAIN ANALYZE result:

Aggregate  (cost=1061561.33..1061561.34 rows=1 width=4) (actual time=17813.460..17813.460 rows=1 loops=1)
  ->  Hash Left Join  (cost=106687.31..1060683.61 rows=351088 width=4) (actual time=763.556..17741.820 rows=376196 loops=1)
        Hash Cond: ((factaddres4_.houseid)::text = (factaddres5_.houseid)::text)
        ->  Hash Join  (cost=106679.25..1059358.95 rows=351088 width=41) (actual time=760.772..17599.742 rows=376196 loops=1)
              Hash Cond: (this_.okopf_ref = okopf_1_.id)
              ->  Merge Right Join  (cost=106599.85..1053887.84 rows=376166 width=45) (actual time=759.211..17411.313 rows=376254 loops=1)
                    Merge Cond: ((factaddres4_.houseid)::text = (this_.factaddress_ref)::text)
                    ->  Index Only Scan using fias_house_pkey on fias_house factaddres4_  (cost=0.56..924229.05 rows=21084566 width=37) (actual time=0.013..8528.487 rows=19627484 loops=1)
                          Heap Fetches: 0
                    ->  Materialize  (cost=74125.25..76006.08 rows=376166 width=45) (actual time=759.171..980.286 rows=376254 loops=1)
                          ->  Sort  (cost=74125.25..75065.67 rows=376166 width=45) (actual time=759.167..863.495 rows=376254 loops=1)
                                Sort Key: this_.factaddress_ref
                                Sort Method: external sort  Disk: 6616kB
                                ->  Seq Scan on subjects this_  (cost=0.00..27715.88 rows=376166 width=45) (actual time=0.790..591.380 rows=376254 loops=1)
                                      Filter: ((is_delete <> 1) AND (is_actual = 1))
                                      Rows Removed by Filter: 138
              ->  Hash  (cost=53.85..53.85 rows=2044 width=4) (actual time=1.522..1.522 rows=2051 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 49kB
                    ->  Seq Scan on refitems okopf_1_  (cost=0.00..53.85 rows=2044 width=4) (actual time=0.019..0.930 rows=2051 loops=1)
                          Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                          Rows Removed by Filter: 139
        ->  Hash  (cost=6.36..6.36 rows=136 width=37) (actual time=2.761..2.761 rows=136 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 8kB
              ->  Seq Scan on fias_agregate_address factaddres5_  (cost=0.00..6.36 rows=136 width=37) (actual time=1.477..2.696 rows=136 loops=1)
Total runtime: 17814.728 ms

Without the FIAS_AGREGATE_ADDRESS join, the query completes in a far more reasonable time. EXPLAIN ANALYZE result:

Aggregate  (cost=34066.40..34066.41 rows=1 width=4) (actual time=510.291..510.292 rows=1 loops=1)
  ->  Hash Join  (cost=79.40..33188.44 rows=351183 width=4) (actual time=1.573..442.526 rows=376196 loops=1)
        Hash Cond: (this_.okopf_ref = okopf_1_.id)
        ->  Seq Scan on subjects this_  (cost=0.00..27715.88 rows=376267 width=45) (actual time=0.144..248.430 rows=376254 loops=1)
              Filter: ((is_delete <> 1) AND (is_actual = 1))
              Rows Removed by Filter: 138
        ->  Hash  (cost=53.85..53.85 rows=2044 width=4) (actual time=1.415..1.415 rows=2051 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 49kB
              ->  Seq Scan on refitems okopf_1_  (cost=0.00..53.85 rows=2044 width=4) (actual time=0.007..0.844 rows=2051 loops=1)
                    Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                    Rows Removed by Filter: 139
Total runtime: 510.367 ms

I found a post on this topic, but I can't use its suggestions because the search conditions may vary.

I also can't drop the FIAS_AGREGATE_ADDRESS join, because there may be search conditions on that table.

Maybe there is some clever index or another option that I have missed out of tiredness and stupidity.

UPD: After increasing work_mem from 8MB to 16MB, the EXPLAIN ANALYZE result is:

Aggregate  (cost=1018467.07..1018467.08 rows=1 width=4) (actual time=18615.975..18615.975 rows=1 loops=1)
  ->  Hash Left Join  (cost=810328.24..1017589.11 rows=351183 width=4) (actual time=3.609..18543.596 rows=376196 loops=1)
        Hash Cond: ((factaddres4_.houseid)::text = (factaddres5_.houseid)::text)
        ->  Hash Join  (cost=810320.18..1016264.10 rows=351183 width=41) (actual time=2.190..18400.383 rows=376196 loops=1)
              Hash Cond: (this_.okopf_ref = okopf_1_.id)
              ->  Merge Left Join  (cost=810240.78..1010791.53 rows=376267 width=45) (actual time=0.838..18203.533 rows=376254 loops=1)
                    Merge Cond: ((this_.factaddress_ref)::text = (factaddres4_.houseid)::text)
                    ->  Index Scan using idx_subjects_factaddress_ref_btree on subjects this_  (cost=0.42..32907.70 rows=376267 width=45) (actual time=0.805..701.428 rows=376254 loops=1)
                    ->  Index Only Scan using fias_house_pkey on fias_house factaddres4_  (cost=0.56..924231.15 rows=21084706 width=37) (actual time=0.013..8885.002 rows=19627486 loops=1)
                          Heap Fetches: 0
              ->  Hash  (cost=53.85..53.85 rows=2044 width=4) (actual time=1.307..1.307 rows=2051 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 49kB
                    ->  Seq Scan on refitems okopf_1_  (cost=0.00..53.85 rows=2044 width=4) (actual time=0.010..0.802 rows=2051 loops=1)
                          Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                          Rows Removed by Filter: 139
        ->  Hash  (cost=6.36..6.36 rows=136 width=37) (actual time=1.396..1.396 rows=136 loops=1)
              Buckets: 1024  Batches: 1  Memory Usage: 8kB
              ->  Seq Scan on fias_agregate_address factaddres5_  (cost=0.00..6.36 rows=136 width=37) (actual time=0.782..1.323 rows=136 loops=1)
Total runtime: 18616.060 ms

The Sort node disappeared, but the query time was essentially unaffected.
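
For anyone who wants to reproduce the work_mem experiment without touching the server configuration, here is a minimal sketch that changes the setting for the current session only. The BUFFERS option is my addition and is not in the plans above; it only adds buffer hit/read counts to the output.

```sql
-- Session-level change only; 16MB is the value tried in the UPD above.
SET work_mem = '16MB';
SHOW work_mem;

-- Re-run the same trimmed query and compare the plan with the earlier one.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
  LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
  LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
  LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1
  AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0';

-- Return to the configured default afterwards.
RESET work_mem;
```

In the UPD plan the time still goes to the Index Only Scan over roughly 19.6 million fias_house entries, which is consistent with the sort not being the bottleneck.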

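Every join has a foreign key, and the referenced columns are primary keys everywhere. For example, the SUBJECTS table has the FK OKOPF_REF -> REFITEMS.ID, and ID is the primary key column in REFITEMS.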
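
To make the FK and index situation checkable without the DDL, something like the following catalog queries could be run. This is only a sketch: the lowercase table names are assumed from how they appear in the plans, and nothing here is specific to this schema beyond those names.

```sql
-- Indexes defined on the four tables involved in the query.
SELECT schemaname, tablename, indexname, indexdef
FROM pg_indexes
WHERE schemaname IN ('erc', 'fias')
  AND tablename IN ('subjects', 'refitems', 'fias_house', 'fias_agregate_address')
ORDER BY schemaname, tablename, indexname;

-- Foreign keys declared on SUBJECTS, with the table each one references.
SELECT conname,
       conrelid::regclass  AS referencing_table,
       confrelid::regclass AS referenced_table,
       pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE contype = 'f'
  AND conrelid = 'erc.subjects'::regclass;
```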

Here is a link to the DDL, including indexes, for these tables:


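I posted a trimmed query to make analysis easier, but there can be various search conditions, such as substring searches in different tables. My worst case is this: for a simple search string like "123", the search has to be executed across all the joined tables, and the count result is still very large. So I can't drop those LEFT JOINs.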

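Note: when doing an outer join, put the conditions on the outer table (here okopf_1_.CODE etc.) in the ON clause rather than the WHERE clause. Otherwise it behaves like a regular inner join.

@jarlh Thanks, I will change it to an inner join in the application, but it won't change anything.

"Sort Method: external sort  Disk: 6616kB": work_mem is set too low. Raise it to at least 10MB and try again. Disk sorts should be avoided because they are slow.

How many matching rows do you expect from each of the three left-joined tables {FIAS_HOUSE factaddres4_, FIAS_AGREGATE_ADDRESS, REFITEMS} per {SUBJECTS} row? If that is never more than one, the left joins will not affect the total count, apart from what @jarlh commented above. Adding part of the schema to the question would definitely help.

Note: given that the left join is executed as {sort+sort+merge}, I suspect there is no usable FK->PK coupling or index between the tables.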
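
To illustrate two of the comments above, here is a rough sketch, not a tested fix. The first statement writes the REFITEMS join as an inner join, which is what the WHERE conditions on okopf_1_.CODE already imply (the plans show a plain Hash Join for it). The second checks whether FIAS_AGREGATE_ADDRESS can ever match more than once per house; that HOUSEID might not be unique there is only an assumption to verify.

```sql
-- 1) NULL okopf_1_.CODE never passes the WHERE clause, so this join is
--    effectively inner; spelling it out states the intent explicitly.
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
  LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
  LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
  INNER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1
  AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0';

-- 2) Lists any HOUSEID that occurs more than once. If this returns no rows,
--    the left join to FIAS_AGREGATE_ADDRESS cannot multiply SUBJECTS rows.
SELECT HOUSEID, count(*) AS n
FROM fias.FIAS_AGREGATE_ADDRESS
GROUP BY HOUSEID
HAVING count(*) > 1;
```

If the second query returns nothing and no search condition touches the FIAS tables, the count is the same with or without those two joins, which matches the row counts in the two plans (376196 in both).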