Postgres trigram search and sorting are very slow

sql postgresql

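I am trying to build a fuzzy search against multiple columns, where the distance between each column and its corresponding search term is weighted.

I have the following query: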

select sf_id 
from (
  select * 
  from (
     select sf_id , 
            (1.0 - cast(coalesce(mailingcity, '') <->> 'san ant' as float)) * 3.0  as score  
     from contacts 
     order by score desc 
     limit 1000
  ) as mailingcity

  union

  select * 
  from (
    select sf_id, 
           (1.0 - cast(coalesce(lastname, '') <->> 'anders' as float)) * 5.0 as score  
    from contacts 
    order by score desc
    limit 1000
  ) as lastname
) 
as agg 
group by sf_id 
order by sum(score) desc

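I have indexes on the columns used for matching.

The table holds about 500,000 records, and the query takes three seconds.

I also have expression indexes on the coalesce expressions.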
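The question does not show the index definitions, so the following is only a sketch of what such indexes might look like. It assumes GiST trigram indexes from the pg_trgm extension; the index names are hypothetical.

```sql
-- Assumption: the % and <->> operators in the query come from pg_trgm.
create extension if not exists pg_trgm;

-- Trigram indexes on the plain columns (hypothetical names).
create index contacts_mailingcity_trgm
  on contacts using gist (mailingcity gist_trgm_ops);
create index contacts_lastname_trgm
  on contacts using gist (lastname gist_trgm_ops);

-- Expression indexes matching the coalesce(...) expressions in the query.
-- The extra parentheses are required around a non-function-call expression.
create index contacts_mailingcity_coalesce_trgm
  on contacts using gist ((coalesce(mailingcity, '')) gist_trgm_ops);
create index contacts_lastname_coalesce_trgm
  on contacts using gist ((coalesce(lastname, '')) gist_trgm_ops);
```

An index of this kind can only help when the planner sees a condition or ordering it recognizes, which is why the sequential scans in the plan are worth checking.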

Is there any way to speed this up?

Explain output:

Sort  (cost=212791.05..212791.55 rows=200 width=154) (actual time=3165.154..3165.247 rows=2000 loops=1)
  Sort Key: (sum(((('1'::double precision - (((COALESCE(contacts.mailingcity, ''::character varying))::text <->> 'san ant'::text))::double precision) * '3'::double precision)))) DESC
  Sort Method: quicksort  Memory: 205kB
  ->  GroupAggregate  (cost=212766.41..212783.41 rows=200 width=154) (actual time=3163.855..3164.621 rows=2000 loops=1)
        Group Key: contacts.sf_id
        ->  Sort  (cost=212766.41..212771.41 rows=2000 width=154) (actual time=3163.847..3163.966 rows=2000 loops=1)
              Sort Key: contacts.sf_id
              Sort Method: quicksort  Memory: 205kB
              ->  HashAggregate  (cost=212616.75..212636.75 rows=2000 width=154) (actual time=3155.719..3156.055 rows=2000 loops=1)
                    Group Key: contacts.sf_id, ((('1'::double precision - (((COALESCE(contacts.mailingcity, ''::character varying))::text <->> 'san ant'::text))::double precision) * '3'::double precision))
                    ->  Append  (cost=106166.70..212606.75 rows=2000 width=154) (actual time=1629.241..3154.841 rows=2000 loops=1)
                          ->  Limit  (cost=106166.70..106283.37 rows=1000 width=27) (actual time=1629.241..1629.798 rows=1000 loops=1)
                                ->  Gather Merge  (cost=106166.70..154807.03 rows=416888 width=27) (actual time=1629.239..1629.730 rows=1000 loops=1)
                                      Workers Planned: 2
                                      Workers Launched: 2
                                      ->  Sort  (cost=105166.68..105687.79 rows=208444 width=27) (actual time=1589.059..1589.232 rows=1021 loops=3)
                                            Sort Key: ((('1'::double precision - (((COALESCE(contacts.mailingcity, ''::character varying))::text <->> 'san ant'::text))::double precision) * '3'::double precision)) DESC
                                            Sort Method: external merge  Disk: 7256kB
                                            ->  Parallel Seq Scan on contacts  (cost=0.00..81763.88 rows=208444 width=27) (actual time=0.145..1405.681 rows=166755 loops=3)
                          ->  Limit  (cost=106166.70..106283.37 rows=1000 width=27) (actual time=1524.305..1524.912 rows=1000 loops=1)
                                ->  Gather Merge  (cost=106166.70..154807.03 rows=416888 width=27) (actual time=1524.304..1524.842 rows=1000 loops=1)
                                      Workers Planned: 2
                                      Workers Launched: 2
                                      ->  Sort  (cost=105166.68..105687.79 rows=208444 width=27) (actual time=1455.159..1455.386 rows=1016 loops=3)
                                            Sort Key: ((('1'::double precision - (((COALESCE(contacts_1.lastname, ''::character varying))::text <->> 'anders'::text))::double precision) * '5'::double precision)) DESC
                                            Sort Method: external merge  Disk: 7280kB
                                            ->  Parallel Seq Scan on contacts contacts_1  (cost=0.00..81763.88 rows=208444 width=27) (actual time=0.373..1290.368 rows=166755 loops=3)
Planning time: 0.855 ms
Execution time: 3218.589 ms

Following the suggestion from @a_horse_with_no_name below, the following query now runs in about 250 ms:

select sf_id
from (
  select *
  from (
    select sf_id,
           (1.0 - cast(coalesce(mailingcity, '') <->> 'san ant' as float)) * 3.0 as score
    from contacts
    where mailingcity % 'san ant'
    order by score desc
    limit 1000
  ) as mailingcity

  union all

  select *
  from (
    select sf_id,
           (1.0 - cast(coalesce(lastname, '') <->> 'anders' as float)) * 5.0 as score
    from contacts
    where lastname % 'anders'
    order by score desc
    limit 1000
  ) as lastname
) as agg
group by sf_id
order by sum(score) desc
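One thing worth keeping in mind with this version: the % filter is based on similarity() and a session threshold, while the score is based on word similarity via <->>, so the filter decides which rows are scored at all. The sketch below shows how the threshold could be inspected or loosened, and what a word-similarity filter might look like. The threshold value 0.2 is an arbitrary assumption, not something from the question.

```sql
-- Calling a pg_trgm function first makes sure the module and its settings are loaded.
select similarity('san ant', 'san antonio');

-- % is true when similarity() exceeds this setting (0.3 by default).
show pg_trgm.similarity_threshold;

-- Assumption: a looser threshold is acceptable; this affects the current session only.
set pg_trgm.similarity_threshold = 0.2;

-- Word-similarity counterpart of the filter: %> is true when
-- word_similarity('san ant', mailingcity) exceeds pg_trgm.word_similarity_threshold.
select sf_id
from contacts
where mailingcity %> 'san ant'
limit 10;
```

Whether % or %> is the better filter depends on how closely the filtered set should follow the word-similarity score. It is worth comparing the results of both against the unfiltered query.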

Please add the explain (analyze) output and format the query so that it is readable. The first optimization would be to use union all instead of union.

Indexes are used to quickly find the rows to be evaluated, which are typically specified by a where clause. Each of your subqueries requests all rows, which then have to be sorted, and most of them are thrown away. An index on the complete expression and sf_id might help.

I have updated the explain text as requested by a_horse_with_no_name.

@a_horse_with_no_name Thanks! Following your suggestion, the query is down to 250 ms, using a where filter with the % operator in each subquery and union all (the full query is shown in the update above).
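The union versus union all point can be illustrated with a small self-contained query. The rows here are hypothetical and do not come from the contacts table: contact 1 happens to get the same score in both branches.

```sql
-- union removes duplicate (sf_id, score) rows before aggregation,
-- so one of the two contributions is lost.
select sf_id, sum(score) as total
from (
  select 1 as sf_id, 3.0 as score
  union
  select 1 as sf_id, 3.0 as score
) as agg
group by sf_id;   -- total = 3.0

-- union all keeps both rows and skips the de-duplication step.
select sf_id, sum(score) as total
from (
  select 1 as sf_id, 3.0 as score
  union all
  select 1 as sf_id, 3.0 as score
) as agg
group by sf_id;   -- total = 6.0
```

In the original plan the de-duplication shows up as the HashAggregate node grouped on sf_id and the score expression. With union all that node is not needed, and the summed score counts every branch.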