Why would a slight change in a search term slow down a query so much?
Tags: performance, postgresql, pattern-matching, query-performance, postgresql-performance
I have a query in PostgreSQL (9.5.1) that filters with unaccent(...) ILIKE unaccent('%vicen%'). It takes 430 ms to retrieve 1129 rows out of 9250 rows total in table esp.
If I change the search term from %vicen% to %vicent% (adding a "t"), it takes 431 ms to retrieve the same 1129 rows.
Ordering by the search column, ascending and descending, I see that all 1129 rows have exactly the same name in both cases.
Now the strange part: if I change the search term from %vicent% to %vicenti% (adding an "i"), it takes an unbelievable 24.4 seconds to retrieve the same 1129 rows.
The searched term is always in the first coalesce, i.e. coalesce(p.abrev,'').
I'd expect the query to run slower or faster depending on the length of the search string, but not by this much! Does anyone have a clue what's going on?
Results of EXPLAIN ANALYZE for %vicen% and for %vicenti% (posting them here would exceed the 30k character limit).
Why?
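The reason is a bad row estimate. The planner expects rows=15 from this join but actually gets rows=1129, and a longer pattern is assumed to match even fewer rows, which can tip the planner into a different plan that is far worse for the real row count.
Fast query: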
-> Hash Left Join (cost=1378.60..2467.48 rows=15 width=79) (actual time=41.759..85.037 rows=1129 loops=1)
...
Filter: (unaccent(((((COALESCE(p.abrev, ''::character varying))::text || ' ('::text) || (COALESCE(p.prenome, ''::character varying))::text) || ')'::text)) ~~* (...)
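Major points
Why f_unaccent()? Because unaccent() can't be indexed: it is only STABLE, and index expressions require IMMUTABLE functions. An IMMUTABLE wrapper function f_unaccent() allows the following multicolumn trigram GIN index: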
CREATE INDEX pess_unaccent_nome_trgm_idx ON pess
USING gin (f_unaccent(abrev) gin_trgm_ops, f_unaccent(prenome) gin_trgm_ops);
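The index above needs the wrapper function and two extensions to exist first. A minimal sketch of one common way to define them follows; it assumes both extensions are installed in schema public, and the function body is an illustration, not necessarily the exact definition the answer had in mind.

```sql
-- Assumption: unaccent and pg_trgm are installed in schema "public".
CREATE EXTENSION IF NOT EXISTS unaccent;
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- IMMUTABLE wrapper around the two-argument form of unaccent().
-- Naming the dictionary explicitly removes the dependency on search_path,
-- which is what keeps the plain one-argument unaccent() from being IMMUTABLE.
CREATE OR REPLACE FUNCTION f_unaccent(text)
  RETURNS text
  LANGUAGE sql IMMUTABLE AS
$func$
SELECT public.unaccent('public.unaccent', $1)
$func$;
```

Once the function exists, the CREATE INDEX statement can reference it, and the query has to use exactly the same expressions, f_unaccent(p.abrev) and f_unaccent(p.prenome), for the index to be considered.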
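If you are not familiar with trigram indexes (pg_trgm), read up on them first: they let LIKE / ILIKE patterns that are not anchored at the start use an index.
Prepared statements are a common way to execute queries with parameters (especially with text from user input). Postgres has to find a plan that works well for any given parameter. Add the wildcards as constants to the search term, like this: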
f_unaccent(p.abrev) ILIKE f_unaccent('%' || 'vicenti' || '%')
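Put together as a prepared statement, the idea might look like the sketch below. The statement name find_pess is hypothetical, the query is reduced to table pess for brevity, and f_unaccent() is assumed to exist already.

```sql
-- Hypothetical statement name; $1 is the raw search term without wildcards.
PREPARE find_pess(text) AS
SELECT p.id, p.abrev, p.prenome
FROM   pess p
WHERE  f_unaccent(p.abrev)   ILIKE f_unaccent('%' || $1 || '%')
   OR  f_unaccent(p.prenome) ILIKE f_unaccent('%' || $1 || '%');

-- The wildcards are part of the statement, so only the term is passed in.
EXECUTE find_pess('vicenti');

DEALLOCATE find_pess;
```

Because the wildcards are constants inside the statement, the planner can tell the pattern is anchored at neither end, whatever value is bound to $1.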
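The date expression can be shortened with to_char():
e.ano::text || to_char(e.mes, 'FM"-"00')
            || to_char(e.dia, 'FM"-"00') AS data
The FM template pattern modifier suppresses the padding that to_char() would otherwise add.
Similarly simplified: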
format('%s (%s)', p.abrev, p.prenome) AS determinador
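This won't make the query faster, but it is much cleaner. format() also renders a NULL argument as an empty string for %s, so the coalesce() calls become unnecessary.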
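Both simplified expressions can be tried in isolation before touching the real query. The sketch below feeds them made-up sample values through VALUES lists; the aliases e and p only mimic the table aliases used in the query.

```sql
-- Sample values are invented for illustration; no real tables are needed.
SELECT e.ano::text
       || to_char(e.mes, 'FM"-"00')
       || to_char(e.dia, 'FM"-"00')           AS data
     , format('%s (%s)', p.abrev, p.prenome)  AS determinador
FROM  (VALUES (2016, 3, 7)) AS e(ano, mes, dia)
CROSS JOIN (VALUES ('Vicentini', NULL::text)) AS p(abrev, prenome);
```

The NULL in prenome is there on purpose: it shows how format() behaves without a coalesce() around the argument.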
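First things last: all the usual advice for performance optimization applies.
If you get all of this right, you should see much faster queries for all patterns.
One way to reduce the size of the range table is to squeeze a trivial part of the query out into a CTE, for example: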
WITH zzz AS (
    SELECT l.id, l.nome
         , coalesce(v.val,v.valf)||' '||vu.unit as altura
         , coalesce(v1.val,v1.valf)||' '||vu1.unit as DAP
    FROM loc l
    left join var v on v.esp = l.id and v.key = 265
    left join varunit vu on vu.id = v.unit
    left join var v1 on v1.esp = l.id and v1.key = 264
    left join varunit vu1 on vu1.id = v1.unit
    )
select e.id, (select count(id) from imgitem ii
              where ii.tabid = e.id and ii.tab = 'esp'
             ) as imgs
     , e.ano, e.mes, e.dia
     , cast(cast(e.ano as varchar(4))||'-'||right('0'||cast(e.mes as varchar(2)),2)||'-'|| right('0'||cast(e.dia as varchar(2)),2) as varchar(10)) as data
     , pl.pltag, e.inpa, e.det, d.ano anodet
     , coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')' determinador
     , d.tax
     , zzz.altura as altura
     , zzz.DAP as DAP
     , d.fam, tf.nome família
     , d.gen, tg.nome gênero
     , d.sp , ts.nome espécie
     , d.inf, e.loc
     , zzz.nome AS localidade
     , e.lat, e.lon
from esp e
left join det d on e.det = d.id       -- these could possibly be
left join pess p on p.id = d.detby    -- plain joins
        --
left join tax tf on d.fam = tf.oldfam
left join tax tg on d.gen = tg.oldgen
left join tax ts on d.sp = ts.oldsp
        -- ### commented out, since it is never referred
        -- ### left join tax ti on d.inf = ti.oldinf
left join pl on pl.id = e.pl
left JOIN zzz ON zzz.id = e.loc
        --
WHERE unaccent(TEXT(coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')')) ilike unaccent('%vicen%')
        ;
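[Untested, since I don't have the table definitions.]
At that point it must be trying to use a different index. Use EXPLAIN to show the execution plan for each one.
Does this happen repeatedly? Also relevant: what does SHOW join_collapse_limit return?
What @ErwinBrandstetter says: probably the range table exceeds the collapse limit (12 or 14), and GEQO kicks in. (Workaround: use CTEs to squeeze out trivial subqueries such as the lookups. Or: increase the limits.) BTW: you are not using any fields from ti, joined with left join tax ti on d.inf = ti.oldinf.
Thank you very much for your detailed answer, Erwin! I'll respond to each point as I work through it. Have a nice weekend!
Why f_unaccent()? Because in Brazil accents are very common in people's names: João, José, Luís, Mário... I'm looking into your strategy.
@Rodrigo: I think you misunderstood. unaccent() is obviously useful. I was only explaining why I use the custom function f_unaccent() instead of the stock unaccent().
About the details: true, I misunderstood. Your single answer has given me study material for several days, hehe. Thanks again!
The date is split into three columns because sometimes we only have the year, or the year and month, for a given record.
There are some options here, but I'm not sure yet which one is best.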
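The planner settings mentioned in the comments can be inspected and raised for a single transaction, as sketched below. The value 16 is an arbitrary illustration, and the SELECT is a cut-down stand-in for the full query; the stock defaults are join_collapse_limit = 8, from_collapse_limit = 8 and geqo_threshold = 12.

```sql
-- Current values of the settings discussed in the comments.
SHOW join_collapse_limit;
SHOW from_collapse_limit;
SHOW geqo_threshold;

BEGIN;
-- SET LOCAL only lasts until the end of this transaction.
SET LOCAL join_collapse_limit = 16;  -- illustrative value, not a recommendation
SET LOCAL from_collapse_limit = 16;  -- illustrative value, not a recommendation
SET LOCAL geqo = off;                -- rule out the genetic query optimizer

-- EXPLAIN ANALYZE really executes the statement and reports actual row counts.
EXPLAIN ANALYZE
SELECT e.id
FROM   esp e
LEFT   JOIN det  d ON e.det = d.id
LEFT   JOIN pess p ON p.id  = d.detby
WHERE  unaccent(TEXT(coalesce(p.abrev,'')||' ('||coalesce(p.prenome,'')||')'))
       ILIKE unaccent('%vicenti%');
ROLLBACK;
```

Comparing the estimated and actual rows in the output for %vicent% and %vicenti%, with and without the changed settings, should show whether the slowdown comes from the estimate, from the join-order search, or from both.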