SQL: speed up a query with multiple joins, GROUP BY and ORDER BY

I have a SQL query that looks like this:

SELECT
    title,
    COUNT(DISTINCT A.id) AS "count_title"
FROM
    B
    INNER JOIN D ON B.app = D.app
    INNER JOIN A ON D.number = A.number
    INNER JOIN C ON A.id = C.id
GROUP BY C.title
ORDER BY count_title DESC
LIMIT 10;
Table D contains 50 million records, A contains 30 million records, and B & C contain about 30k records each. Indexes are defined on all the columns used in the joins, the GROUP BY, and the ORDER BY.
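
For context, a minimal sketch of what those indexes might look like, assuming the column names from the query above; the question does not give the actual index definitions, so the names here are hypothetical:

-- Hypothetical single-column indexes on the join / GROUP BY columns
CREATE INDEX b_app_idx    ON B (app);
CREATE INDEX d_app_idx    ON D (app);
CREATE INDEX d_number_idx ON D (number);
CREATE INDEX a_number_idx ON A (number);
CREATE INDEX a_id_idx     ON A (id);
CREATE INDEX c_id_idx     ON C (id);
CREATE INDEX c_title_idx  ON C (title);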

The query runs fine without the ORDER BY clause and returns results in about 2-3 seconds.

However, with the sort operation (ORDER BY), the query time increases to 10-12 seconds.

I understand the reason behind this: the executor has to go through all the records for the sort operation, so an index is of little help here.

Is there any other way to speed up this query?

Here is the EXPLAIN ANALYZE output for this query:

"QUERY PLAN"
"Limit  (cost=974652.20..974652.22 rows=10 width=54) (actual time=2817.579..2825.071 rows=10 loops=1)"
"  Buffers: shared hit=120299 read=573195"
"  ->  Sort  (cost=974652.20..974666.79 rows=5839 width=54) (actual time=2817.578..2817.578 rows=10 loops=1)"
"        Sort Key: (count(DISTINCT A.id)) DESC"
"        Sort Method: top-N heapsort  Memory: 26kB"
"        Buffers: shared hit=120299 read=573195"
"        ->  GroupAggregate  (cost=974325.65..974526.02 rows=5839 width=54) (actual time=2792.465..2817.097 rows=3618 loops=1)"
"              Group Key: C.title"
"              Buffers: shared hit=120299 read=573195"
"              ->  Sort  (cost=974325.65..974372.97 rows=18931 width=32) (actual time=2792.451..2795.161 rows=45175 loops=1)"
"                    Sort Key: C.title"
"                    Sort Method: quicksort  Memory: 5055kB"
"                    Buffers: shared hit=120299 read=573195"
"                    ->  Gather  (cost=968845.30..972980.74 rows=18931 width=32) (actual time=2753.402..2778.648 rows=45175 loops=1)"
"                          Workers Planned: 1"
"                          Workers Launched: 1"
"                          Buffers: shared hit=120299 read=573195"
"                          ->  Parallel Hash Join  (cost=967845.30..970087.64 rows=11136 width=32) (actual time=2751.725..2764.832 rows=22588 loops=2)"
"                                Hash Cond: ((C.id)::text = (A.id)::text)"
"                                Buffers: shared hit=120299 read=573195"
"                                ->  Parallel Seq Scan on C  (cost=0.00..1945.87 rows=66687 width=32) (actual time=0.017..4.316 rows=56684 loops=2)"
"                                      Buffers: shared read=1279"
"                                ->  Parallel Hash  (cost=966604.55..966604.55 rows=99260 width=9) (actual time=2750.987..2750.987 rows=20950 loops=2)"
"                                      Buckets: 262144  Batches: 1  Memory Usage: 4032kB"
"                                      Buffers: shared hit=120266 read=571904"
"                                      ->  Nested Loop  (cost=219572.23..966604.55 rows=99260 width=9) (actual time=665.832..2744.270 rows=20950 loops=2)"
"                                            Buffers: shared hit=120266 read=571904"
"                                            ->  Parallel Hash Join  (cost=219571.79..917516.91 rows=99260 width=4) (actual time=665.804..2583.675 rows=20950 loops=2)"
"                                                  Hash Cond: ((D.app)::text = (B.app)::text)"
"                                                  Buffers: shared hit=8 read=524214"
"                                                  ->  Parallel Bitmap Heap Scan on D  (cost=217542.51..895848.77 rows=5126741 width=13) (actual time=661.254..1861.862 rows=6160441 loops=2)"
"                                                        Recheck Cond: ((action_type)::text = ANY ('{10,11}'::text[]))"
"                                                        Heap Blocks: exact=242152"
"                                                        Buffers: shared hit=3 read=523925"
"                                                        ->  Bitmap Index Scan on D_index_action_type  (cost=0.00..214466.46 rows=12304178 width=0) (actual time=546.470..546.471 rows=12320882 loops=1)"
"                                                              Index Cond: ((action_type)::text = ANY ('{10,11}'::text[]))"
"                                                              Buffers: shared hit=3 read=33669"
"                                                  ->  Parallel Hash  (cost=1859.36..1859.36 rows=13594 width=12) (actual time=4.337..4.337 rows=16313 loops=2)"
"                                                        Buckets: 32768  Batches: 1  Memory Usage: 1152kB"
"                                                        Buffers: shared hit=5 read=289"
"                                                        ->  Parallel Index Only Scan using B_index_app on B  (cost=0.29..1859.36 rows=13594 width=12) (actual time=0.015..2.218 rows=16313 loops=2)"
"                                                              Heap Fetches: 0"
"                                                              Buffers: shared hit=5 read=289"
"                                            ->  Index Scan using A_index_number on A  (cost=0.43..0.48 rows=1 width=24) (actual time=0.007..0.007 rows=1 loops=41900)"
"                                                  Index Cond: ((number)::text = (D.number)::text)"
"                                                  Buffers: shared hit=120258 read=47690"
"Planning Time: 0.747 ms"
"Execution Time: 2825.118 ms"

You could try to get a nested loop join between b and d, since b is much smaller:

CREATE INDEX ON d (app);

If d is vacuumed often enough, you can see whether an index-only scan is faster. For that, include number in the index (in v11, use the INCLUDE clause for this!). The EXPLAIN output shows that you have an additional condition on action_type; for an index-only scan you would have to include that column as well.
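
As a hedged sketch of the index described above, assuming the column names from the query (the INCLUDE form requires PostgreSQL 11 or later):

-- PostgreSQL 11+: covering index to allow an index-only scan on d
CREATE INDEX ON d (app) INCLUDE (number, action_type);

-- Before v11, a plain multi-column index gives a similar effect:
-- CREATE INDEX ON d (app, number, action_type);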

Could you add the EXPLAIN (ANALYZE, BUFFERS) output of the query to the question?

I have added it, @LaurenzAlbe. I guess the query was cached, so it executed faster this time.

The execution plan is from a different query. I see d.action_type IN ('10', '11') in there.

Yes, that is an additional condition, but it shouldn't make much of a difference.

It does make a difference, because in that case you can try an index-only scan on d. Create a covering index: CREATE INDEX d_covering_index ON d (app) INCLUDE (number, action_type).

The executor still does a bitmap heap scan. a) Did you VACUUM d? b) Did you try an index on app only?

Yes, I vacuumed d and also defined a single index on app, but the execution plan still hasn't changed. Also, after joining A & C the result is about 80k rows. In table D, after applying the action_type constraint, the record count is about 4M with 3 loops (using the bitmap heap scan), and this step takes up most of the execution time.

OK, then I see no hope of making this query faster. You can throw hardware (RAM) at it.
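
A minimal sketch of the verification loop discussed in these comments, assuming the covering index above has been created: vacuum d so its visibility map is current, then re-run the plan and check whether the bitmap heap scan on d has been replaced by an index-only scan:

-- Refresh the visibility map so index-only scans become possible
VACUUM (ANALYZE) d;

-- Re-run the original query and inspect the plan for an Index Only Scan node on d
EXPLAIN (ANALYZE, BUFFERS)
SELECT title, COUNT(DISTINCT A.id) AS "count_title"
FROM B
INNER JOIN D ON B.app = D.app
INNER JOIN A ON D.number = A.number
INNER JOIN C ON A.id = C.id
GROUP BY C.title
ORDER BY count_title DESC
LIMIT 10;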