
Postgres query is slow despite using an index


I have the following tables.

The main lead table has almost 500M rows:

create table lead
(
    id                  integer,
    client_id           integer,
    insert_date         integer  -- a transformed date that looks like 20201231
);

create index lead_id_index
    on lead (id);

create index lead_insert_date_index
    on lead (insert_date) include (id, client_id);

create index lead_client_id_index
    on lead (client_id) include (id, insert_date);
Then there are the other tables:

create table last_activity_with_client
(
    lead_id       integer,
    last_activity timestamp,
    last_modified timestamp,
    client_id     integer
);

create index last_activity_with_client_client_id_index
    on last_activity_with_client (client_id) include (lead_id, last_activity);

create index last_activity_with_client_last_activity_index
    on last_activity_with_client (last_activity desc);

create index last_activity_with_client_lead_id_client_id_index
    on last_activity_with_client (lead_id, client_id);


create table lead_last_response_time
(
    lead_id            integer,
    last_response_time timestamp,
    last_modified      timestamp
);

create index lead_last_response_time_last_response_time_index
    on lead_last_response_time (last_response_time desc);

create index lead_last_response_time_lead_id_index
    on lead_last_response_time (lead_id);



create table date_dimensions
(
    key                      integer,  -- a transformed date that looks like 20201231
    date                     date,
    description              varchar(256),
    day                      smallint,
    month                    smallint,
    quarter                  char(2),
    year                     smallint,
    past_30                  boolean
);

create index date_dimensions_key_index
    on date_dimensions (key);

I tried running the following query for different client_id values, but the Bitmap Index Scan on client_id in the lead table always slows the query down:

EXPLAIN ANALYZE
with TempResult AS (
    select DISTINCT lead.id AS lead_id,
                    last_activity_join.last_activity,
                    lead_last_response_time.last_response_time
    from lead
             left join (select * from last_activity_with_client where client_id = 13189) last_activity_join on
        lead.id = last_activity_join.lead_id

             left join lead_last_response_time lead_last_response_time on
        lead.id = lead_last_response_time.lead_id

             join date_dimensions date_dimensions on
        lead.insert_date = date_dimensions.key

    where (date_dimensions.past_30 = true)
      and (lead.client_id in (13189))
),
     TempCount AS (
         select COUNT(*) as total_rows
         from TempResult
     )
select *
from TempResult, TempCount
order by last_response_time desc NULLS LAST
limit 25 offset 1;
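Some results:

As you can see, it is using the index, but it is quite slow: always more than 50 seconds. What can I do to speed up the query? I am also free to change the query and the tables.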

Try this:

EXPLAIN ANALYZE
with TempResult AS (
    select DISTINCT lead.id AS lead_id,
                    last_activity,
                    last_response_time
    from (
             select key
             from date_dimensions
             where past_30 = true
         ) date_dimensions
             join (
                 select id,
                        insert_date
                 from lead
                 where client_id = 13189
             ) lead on lead.insert_date = date_dimensions.key
             left join (
                 select lead_id,
                        last_activity
                 from last_activity_with_client
                 where client_id = 13189
             ) last_activity_join on lead.id = last_activity_join.lead_id
             left join lead_last_response_time lead_last_response_time
                 on lead.id = lead_last_response_time.lead_id
),
     TempCount AS (
         select COUNT(*) as total_rows
         from TempResult
     )
select *
from TempResult, TempCount
order by last_response_time desc NULLS LAST
limit 25 offset 1;
Or this:

EXPLAIN ANALYZE
with TempResult AS (
    select DISTINCT lead.id AS lead_id,
                    last_activity,
                    last_response_time
    from date_dimensions date_dimensions
             join (
                 select id,
                        insert_date
                 from lead
                 where client_id = 13189
             ) lead on lead.insert_date = date_dimensions.key
             left join (
                 select lead_id,
                        last_activity
                 from last_activity_with_client
                 where client_id = 13189
             ) last_activity_join on lead.id = last_activity_join.lead_id
             left join lead_last_response_time lead_last_response_time
                 on lead.id = lead_last_response_time.lead_id
    where date_dimensions.past_30 = true
),
     TempCount AS (
         select COUNT(*) as total_rows
         from TempResult
     )
select *
from TempResult, TempCount
order by last_response_time desc NULLS LAST
limit 25 offset 1;
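To be used efficiently in this query, the index should instead be on lead (client_id, insert_date, id). Using INCLUDE here only makes the index less useful without accomplishing anything. I think the only reasons to use INCLUDE are if the index is unique on a subset of the columns, or if the included column is of a type that does not support btree operations.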
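The index suggested above could be created roughly as follows. This is only a sketch: the index name is made up for illustration, and the use of CONCURRENTLY and the follow-up ANALYZE are assumptions about what is practical on a table of this size rather than part of the original answer.

```sql
-- Hypothetical index name; the column order follows the suggestion above.
-- CONCURRENTLY avoids blocking writes while the index builds on the ~500M-row table,
-- but it cannot be run inside a transaction block.
create index concurrently lead_client_id_insert_date_id_index
    on lead (client_id, insert_date, id);

-- Refresh planner statistics once the index exists.
analyze lead;
```

With insert_date as a key column rather than an included one, the planner can restrict the scan to the matching dates within a single client_id instead of reading every index entry for that client.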


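But even the existing index seems surprisingly slow. I wonder whether something is wrong with it, for example that it is fragmented, or that it sits on a damaged part of the disk and reads have to be retried repeatedly before they succeed.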
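One way to look into this, sketched under the assumption that the pgstattuple contrib extension can be installed on the server and that the server runs PostgreSQL 12 or later (needed for REINDEX ... CONCURRENTLY):

```sql
-- pgstattuple is a contrib extension; having it available is an assumption.
create extension if not exists pgstattuple;

-- Reports leaf density and leaf fragmentation for a btree index.
select avg_leaf_density, leaf_fragmentation
from pgstatindex('lead_client_id_index');

-- Shows how many buffers were found in cache versus read from disk.
explain (analyze, buffers)
select id, insert_date
from lead
where client_id = 13189;

-- Rebuilds the index without blocking writes (PostgreSQL 12 or later).
reindex index concurrently lead_client_id_index;
```

A large number of buffers read from disk in the BUFFERS output would point toward I/O as the bottleneck rather than the index structure itself.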

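You don't use TempCount, so you could start by eliminating it.

Do your results have to be for client_id in (13189) (or some other specific client_id), or are you doing this just for testing?

@GordonLinoff Edited the query.

@StefanDzalev All queries will filter on client_id. It is not just for testing.

Can you edit the question and qualify all columns with their table names? Otherwise the query is hard to read.

Thanks! I tried this approach, but a simple SELECT * FROM lead WHERE client_id = 12345 by itself takes a long time even with the index. Sometimes it is a Bitmap Index Scan and sometimes a Parallel Seq Scan, depending on how much data the client holds. But in all cases it takes a long time.

I suggest you check whether the client_id index is fragmented. If it is, you will have to reorganize it, which will make your query faster.

Thanks. Following your post I ran VACUUM FULL, but I still see the same response times.

I created this index and ran VACUUM. For date ranges shorter than 30 days it returns in under 1 s. But for larger date ranges it degrades quickly, to more than 15 seconds.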