PostgreSQL: slow count(*) SELECT performance with additional LEFT JOINs
I have the following SELECT:
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1 AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0'
It runs for almost 18 seconds.
The SUBJECTS table has 376k rows, FIAS_HOUSE has 21 million rows, and FIAS_AGREGATE_ADDRESS has about 130.
EXPLAIN ANALYZE output:
Aggregate (cost=1061561.33..1061561.34 rows=1 width=4) (actual time=17813.460..17813.460 rows=1 loops=1)
  -> Hash Left Join (cost=106687.31..1060683.61 rows=351088 width=4) (actual time=763.556..17741.820 rows=376196 loops=1)
        Hash Cond: ((factaddres4_.houseid)::text = (factaddres5_.houseid)::text)
        -> Hash Join (cost=106679.25..1059358.95 rows=351088 width=41) (actual time=760.772..17599.742 rows=376196 loops=1)
              Hash Cond: (this_.okopf_ref = okopf_1_.id)
              -> Merge Right Join (cost=106599.85..1053887.84 rows=376166 width=45) (actual time=759.211..17411.313 rows=376254 loops=1)
                    Merge Cond: ((factaddres4_.houseid)::text = (this_.factaddress_ref)::text)
                    -> Index Only Scan using fias_house_pkey on fias_house factaddres4_ (cost=0.56..924229.05 rows=21084566 width=37) (actual time=0.013..8528.487 rows=19627484 loops=1)
                          Heap Fetches: 0
                    -> Materialize (cost=74125.25..76006.08 rows=376166 width=45) (actual time=759.171..980.286 rows=376254 loops=1)
                          -> Sort (cost=74125.25..75065.67 rows=376166 width=45) (actual time=759.167..863.495 rows=376254 loops=1)
                                Sort Key: this_.factaddress_ref
                                Sort Method: external sort Disk: 6616kB
                                -> Seq Scan on subjects this_ (cost=0.00..27715.88 rows=376166 width=45) (actual time=0.790..591.380 rows=376254 loops=1)
                                      Filter: ((is_delete <> 1) AND (is_actual = 1))
                                      Rows Removed by Filter: 138
              -> Hash (cost=53.85..53.85 rows=2044 width=4) (actual time=1.522..1.522 rows=2051 loops=1)
                    Buckets: 1024 Batches: 1 Memory Usage: 49kB
                    -> Seq Scan on refitems okopf_1_ (cost=0.00..53.85 rows=2044 width=4) (actual time=0.019..0.930 rows=2051 loops=1)
                          Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                          Rows Removed by Filter: 139
        -> Hash (cost=6.36..6.36 rows=136 width=37) (actual time=2.761..2.761 rows=136 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 8kB
              -> Seq Scan on fias_agregate_address factaddres5_ (cost=0.00..6.36 rows=136 width=37) (actual time=1.477..2.696 rows=136 loops=1)
Total runtime: 17814.728 ms
Without the join to FIAS_AGREGATE_ADDRESS, the query completes in a much more acceptable time. EXPLAIN ANALYZE output:
Aggregate (cost=34066.40..34066.41 rows=1 width=4) (actual time=510.291..510.292 rows=1 loops=1)
  -> Hash Join (cost=79.40..33188.44 rows=351183 width=4) (actual time=1.573..442.526 rows=376196 loops=1)
        Hash Cond: (this_.okopf_ref = okopf_1_.id)
        -> Seq Scan on subjects this_ (cost=0.00..27715.88 rows=376267 width=45) (actual time=0.144..248.430 rows=376254 loops=1)
              Filter: ((is_delete <> 1) AND (is_actual = 1))
              Rows Removed by Filter: 138
        -> Hash (cost=53.85..53.85 rows=2044 width=4) (actual time=1.415..1.415 rows=2051 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 49kB
              -> Seq Scan on refitems okopf_1_ (cost=0.00..53.85 rows=2044 width=4) (actual time=0.007..0.844 rows=2051 loops=1)
                    Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                    Rows Removed by Filter: 139
Total runtime: 510.367 ms
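I found this article:
But I can't use its suggestions, because the search conditions may vary.
I also can't drop the FIAS_AGREGATE_ADDRESS join, because there may be search conditions on that table.
Maybe there is some clever index or another option that I have missed out of tiredness and stupidity.
UPD: After increasing work_mem from 8MB to 16MB, the EXPLAIN ANALYZE output is: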
Aggregate (cost=1018467.07..1018467.08 rows=1 width=4) (actual time=18615.975..18615.975 rows=1 loops=1)
  -> Hash Left Join (cost=810328.24..1017589.11 rows=351183 width=4) (actual time=3.609..18543.596 rows=376196 loops=1)
        Hash Cond: ((factaddres4_.houseid)::text = (factaddres5_.houseid)::text)
        -> Hash Join (cost=810320.18..1016264.10 rows=351183 width=41) (actual time=2.190..18400.383 rows=376196 loops=1)
              Hash Cond: (this_.okopf_ref = okopf_1_.id)
              -> Merge Left Join (cost=810240.78..1010791.53 rows=376267 width=45) (actual time=0.838..18203.533 rows=376254 loops=1)
                    Merge Cond: ((this_.factaddress_ref)::text = (factaddres4_.houseid)::text)
                    -> Index Scan using idx_subjects_factaddress_ref_btree on subjects this_ (cost=0.42..32907.70 rows=376267 width=45) (actual time=0.805..701.428 rows=376254 loops=1)
                    -> Index Only Scan using fias_house_pkey on fias_house factaddres4_ (cost=0.56..924231.15 rows=21084706 width=37) (actual time=0.013..8885.002 rows=19627486 loops=1)
                          Heap Fetches: 0
              -> Hash (cost=53.85..53.85 rows=2044 width=4) (actual time=1.307..1.307 rows=2051 loops=1)
                    Buckets: 1024 Batches: 1 Memory Usage: 49kB
                    -> Seq Scan on refitems okopf_1_ (cost=0.00..53.85 rows=2044 width=4) (actual time=0.010..0.802 rows=2051 loops=1)
                          Filter: (((code)::text !~~ '5%'::text) AND ((code)::text <> '0'::text))
                          Rows Removed by Filter: 139
        -> Hash (cost=6.36..6.36 rows=136 width=37) (actual time=1.396..1.396 rows=136 loops=1)
              Buckets: 1024 Batches: 1 Memory Usage: 8kB
              -> Seq Scan on fias_agregate_address factaddres5_ (cost=0.00..6.36 rows=136 width=37) (actual time=0.782..1.323 rows=136 loops=1)
Total runtime: 18616.060 ms
The Sort node is gone, but the query time was practically unaffected.
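For anyone repeating this experiment, here is a minimal sketch of how the setting can be changed for a single session and the plan re-measured. The 16MB value comes from the update above; the BUFFERS option is an addition not used in the original post, included only because it shows how much data each node reads.

SET work_mem = '16MB';  -- affects only the current session

EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
LEFT OUTER JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
WHERE this_.IS_ACTUAL = 1 AND this_.IS_DELETE <> 1 AND NOT okopf_1_.CODE LIKE '5%' AND NOT okopf_1_.CODE = '0';

RESET work_mem;  -- back to the configured default

In both slow plans most of the time is spent in the merge join that walks roughly 19.6 million fias_house index entries, which would explain why removing the disk sort alone made little difference.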
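Every join has a foreign key, and the referenced columns are primary keys everywhere. For example, the SUBJECTS table has the FK OKOPF_REF -> REFITEMS.ID, and ID is the primary key column of REFITEMS.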
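Key declarations matter here because PostgreSQL can drop a LEFT JOIN from the plan when no column of the joined table is used outside the join condition and the join column is covered by a unique index. The fast plan above, which never touches fias_house, is consistent with that. The following is only a sketch for checking whether the same could apply to FIAS_AGREGATE_ADDRESS. It assumes HOUSEID is meant to be unique in that table, which the post does not confirm, and the index name is hypothetical.

-- Check: does any HOUSEID occur more than once? No rows returned means it is unique.
SELECT HOUSEID, count(*) AS n
FROM fias.FIAS_AGREGATE_ADDRESS
GROUP BY HOUSEID
HAVING count(*) > 1;

-- Only if the check returns no rows and no unique index exists yet:
-- declare the uniqueness so the planner can rely on it (hypothetical index name).
CREATE UNIQUE INDEX fias_agregate_address_houseid_uq
    ON fias.FIAS_AGREGATE_ADDRESS (HOUSEID);

This could only help counts that put no search condition on the FIAS tables. Once a filter references their columns, the joins have to be executed.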
Here is a link to the DDL for these tables, including their indexes:
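To make the analysis easier I posted a trimmed-down query, but there can be various search conditions, such as substring searches in different tables. My worst case is a simple search string like "123": the search has to be applied across all the joined tables, and the resulting count is still very large. So I can't simply leave out those LEFT JOINs.
Note that when doing an outer join, you should put the conditions on the outer-joined table (e.g. those on okopf_1_.CODE) in the ON clause rather than in the WHERE clause. Otherwise it is just a regular inner join.
@jarlh Thanks, I will change it to an inner join in the application, but that won't change anything.
"Sort Method: external sort Disk: 6616kB": work_mem is set too low. Raise it to at least 10MB and try again. Disk sorts should be avoided because they are slow.
How many matching rows do you expect from each of the three left-joined tables {FIAS_HOUSE factaddres4_, FIAS_AGREGATE_ADDRESS, REFITEMS} per {SUBJECTS} row? If that is never more than one row, the left joins do not affect the total count, apart from what @jarlh commented above. Adding part of the schema to the question would definitely help. Note: given that the left join is performed by {sort+sort+merge}, I suspect there is no usable FK->PK coupling or index between the tables.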
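The first comment's point can be sketched as follows. Because the WHERE clause tests okopf_1_.CODE, any SUBJECTS row without a matching REFITEMS row is discarded anyway, which is why the plans show a plain Hash Join for that table. Assuming the count should keep exactly those semantics, the query can state the inner join explicitly:

SELECT count(*) AS y0_
FROM erc.SUBJECTS this_
-- inner join: rows without a REFITEMS match were already filtered out by WHERE
JOIN erc.REFITEMS okopf_1_ ON this_.OKOPF_REF = okopf_1_.ID
LEFT OUTER JOIN fias.FIAS_HOUSE factaddres4_ ON this_.FACTADDRESS_REF = factaddres4_.houseId
LEFT OUTER JOIN fias.FIAS_AGREGATE_ADDRESS factaddres5_ ON factaddres4_.houseId = factaddres5_.HOUSEID
WHERE this_.IS_ACTUAL = 1
  AND this_.IS_DELETE <> 1
  AND okopf_1_.CODE NOT LIKE '5%'
  AND okopf_1_.CODE <> '0';

Moving the CODE conditions into the ON clause of a LEFT JOIN instead would change the result, since SUBJECTS rows without a qualifying REFITEMS row would then be counted. As the asker notes, the explicit inner join is not expected to change the runtime, because the planner already treats that join as an inner join.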