
Postgres trigram search with sorting is very slow

Tags: sql, postgresql, group-by

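I am trying to build a fuzzy search across multiple columns, where the distance between each column and its corresponding search term is weighted.

I have the following query: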

select sf_id 
from (
  select * 
  from (
     select sf_id , 
            (1.0 - cast(coalesce(mailingcity, '') <->> 'san ant' as float)) * 3.0  as score  
     from contacts 
     order by score desc 
     limit 1000
  ) as mailingcity

  union

  select * 
  from (
    select sf_id, 
           (1.0 - cast(coalesce(lastname, '') <->> 'anders' as float)) * 5.0 as score  
    from contacts 
    order by score desc 
    limit 1000
  ) as lastname
) 
as agg 
group by sf_id 
order by sum(score) desc
I have indexes on the columns used for matching.

The table has about 500,000 records, and the query takes three seconds.

I also have expression indexes on the coalesce expressions.
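
For reference, an expression index of this kind might look like the sketch below. The index names are hypothetical, and the GiST operator class gist_trgm_ops is an assumption, since the question does not say which index type is in use. An expression index is only considered when the query uses the same expression, and a sort on (1.0 - distance) * 3.0 descending is not something a distance-ordered index scan can serve directly, whereas a sort on the distance itself may be. Whether the planner actually picks the index depends on the PostgreSQL and pg_trgm versions, so it is worth checking with explain (analyze).

```sql
-- Assumption: pg_trgm is available; index names are made up for illustration.
create extension if not exists pg_trgm;

create index contacts_mailingcity_trgm_idx
    on contacts using gist ((coalesce(mailingcity, '')) gist_trgm_ops);

create index contacts_lastname_trgm_idx
    on contacts using gist ((coalesce(lastname, '')) gist_trgm_ops);

-- Ordering by the raw distance (ascending) ranks rows the same way as ordering by
-- (1.0 - distance) * 3.0 descending; the weight is applied only in the select list.
select sf_id,
       (1.0 - (coalesce(mailingcity, '') <->> 'san ant')) * 3.0 as score
from contacts
order by coalesce(mailingcity, '') <->> 'san ant'
limit 1000;
```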

Is there any way to speed this up?

Explain output:

Sort  (cost=212791.05..212791.55 rows=200 width=154) (actual time=3165.154..3165.247 rows=2000 loops=1)
  Sort Key: (sum(((('1'::double precision - (((COALESCE(contacts.mailingcity, ''::character varying))::text <->> 'san ant'::text))::double precision) * '3'::double precision)))) DESC
  Sort Method: quicksort  Memory: 205kB
  ->  GroupAggregate  (cost=212766.41..212783.41 rows=200 width=154) (actual time=3163.855..3164.621 rows=2000 loops=1)
        Group Key: contacts.sf_id
        ->  Sort  (cost=212766.41..212771.41 rows=2000 width=154) (actual time=3163.847..3163.966 rows=2000 loops=1)
              Sort Key: contacts.sf_id
              Sort Method: quicksort  Memory: 205kB
              ->  HashAggregate  (cost=212616.75..212636.75 rows=2000 width=154) (actual time=3155.719..3156.055 rows=2000 loops=1)
                    Group Key: contacts.sf_id, ((('1'::double precision - (((COALESCE(contacts.mailingcity, ''::character varying))::text <->> 'san ant'::text))::double precision) * '3'::double precision))
                    ->  Append  (cost=106166.70..212606.75 rows=2000 width=154) (actual time=1629.241..3154.841 rows=2000 loops=1)
                          ->  Limit  (cost=106166.70..106283.37 rows=1000 width=27) (actual time=1629.241..1629.798 rows=1000 loops=1)
                                ->  Gather Merge  (cost=106166.70..154807.03 rows=416888 width=27) (actual time=1629.239..1629.730 rows=1000 loops=1)
                                      Workers Planned: 2
                                      Workers Launched: 2
                                      ->  Sort  (cost=105166.68..105687.79 rows=208444 width=27) (actual time=1589.059..1589.232 rows=1021 loops=3)
                                            Sort Key: ((('1'::double precision - (((COALESCE(contacts.mailingcity, ''::character varying))::text <->> 'san ant'::text))::double precision) * '3'::double precision)) DESC
                                            Sort Method: external merge  Disk: 7256kB
                                            ->  Parallel Seq Scan on contacts  (cost=0.00..81763.88 rows=208444 width=27) (actual time=0.145..1405.681 rows=166755 loops=3)
                          ->  Limit  (cost=106166.70..106283.37 rows=1000 width=27) (actual time=1524.305..1524.912 rows=1000 loops=1)
                                ->  Gather Merge  (cost=106166.70..154807.03 rows=416888 width=27) (actual time=1524.304..1524.842 rows=1000 loops=1)
                                      Workers Planned: 2
                                      Workers Launched: 2
                                      ->  Sort  (cost=105166.68..105687.79 rows=208444 width=27) (actual time=1455.159..1455.386 rows=1016 loops=3)
                                            Sort Key: ((('1'::double precision - (((COALESCE(contacts_1.lastname, ''::character varying))::text <->> 'anders'::text))::double precision) * '5'::double precision)) DESC
                                            Sort Method: external merge  Disk: 7280kB
                                            ->  Parallel Seq Scan on contacts contacts_1  (cost=0.00..81763.88 rows=208444 width=27) (actual time=0.373..1290.368 rows=166755 loops=3)
Planning time: 0.855 ms
Execution time: 3218.589 ms
Following the suggestion from @a_horse_with_no_name below, the query below now runs in about 250 ms:

select sf_id 
from (
  select * 
  from (
    select sf_id, 
           (1.0 - cast(coalesce(mailingcity, '') <->> 'san ant' as float)) * 3.0 as score  
    from contacts 
    where mailingcity % 'san ant' 
    order by score desc 
    limit 1000
  ) as mailingcity

  union all

  select * 
  from (
    select sf_id, 
           (1.0 - cast(coalesce(lastname, '') <->> 'anders' as float)) * 5.0 as score  
    from contacts 
    where lastname % 'anders'
    order by score desc 
    limit 1000
  ) as lastname
) as agg 
group by sf_id 
order by sum(score) desc
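
The speed-up presumably comes from the where clause: % keeps only rows whose similarity exceeds a threshold, so far fewer rows reach the sort. Two things are worth keeping in mind: % compares whole-string similarity while <->> is based on word similarity, and rows below the threshold no longer appear in the result at all. The statements below are a minimal sketch for inspecting this, assuming pg_trgm is installed in a version that provides the pg_trgm.* settings (PostgreSQL 9.6 or later); the literal 'San Antonio' is only an illustrative value, not data from the question.

```sql
-- Compare the two measures for one sample value
-- (calling a pg_trgm function also loads the module's settings).
select similarity('San Antonio', 'san ant')      as whole_string_similarity,
       word_similarity('san ant', 'San Antonio') as word_sim;

-- Thresholds used by % and by <% / %> respectively.
show pg_trgm.similarity_threshold;
show pg_trgm.word_similarity_threshold;

-- Check how many rows pass the filter and whether an index is used.
explain (analyze, buffers)
select sf_id
from contacts
where mailingcity % 'san ant';
```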

Please add the explain (analyze) output and format the query so that it is readable. A first optimization is to use union all instead of union.

Indexes are used to quickly find the rows to be evaluated, typically those specified by a where clause. Each of your inner queries requests all rows, which then have to be sorted before most of them are discarded. An index on the complete expression and sf_id might help.

I have updated the explain text as per a_horse_with_no_name's request.

@a_horse_with_no_name Thanks! Based on your suggestions, the updated query shown in the question now runs in about 250 ms.
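
The difference between union and union all mentioned in the comments can be seen on a toy input. This is only a sketch: the two CTEs stand in for the inner queries, and their rows are invented. Besides the extra de-duplication step (the HashAggregate node in the plan), union also drops a row when the same sf_id happens to get an identical score from both branches, which changes the sum.

```sql
-- Toy rows standing in for the two inner queries; the values are invented.
with mailingcity(sf_id, score) as (values ('a', 1.5), ('b', 0.9)),
     lastname(sf_id, score)    as (values ('a', 1.5), ('c', 2.0))
select 'union' as variant, sf_id, sum(score) as total
from (select * from mailingcity union select * from lastname) as agg
group by sf_id
union all
select 'union all', sf_id, sum(score)
from (select * from mailingcity union all select * from lastname) as agg
group by sf_id
order by variant, sf_id;
```

With union, the duplicate row ('a', 1.5) is collapsed and 'a' totals 1.5; with union all both rows are kept and 'a' totals 3.0.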