Optimizing a window query in PostgreSQL

sql postgresql

I have a products table with about 17,000,000 records:

    CREATE TABLE vendor_prices (
        id              serial PRIMARY KEY,
        vendor          integer NOT NULL,
        sku             character varying(25) NOT NULL,
        category_name   character varying(100) NOT NULL,
        price           numeric(8,5) NOT NULL,
        effective_date  timestamp without time zone,
        expiration_date timestamp without time zone DEFAULT (now() + '1 year'::interval)
    );

My problem is that the query takes far too long to execute, especially for vendors with a large number of records. The vendor in this query (WHERE vendor = 516) has about 3 million rows, of which only about 80K are not redundant. How can I improve this query?
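The query text is missing from this copy of the question. As a reading aid only, here is a guess at its general shape, inferred from the nodes in the EXPLAIN ANALYZE output in this question: a count over a subquery aliased `d`, a boolean column `del`, and a window sorted by `sku, effective_date, id`. The expression that computes `del` is an assumption for illustration, not the asker's actual logic.

```sql
-- Hypothetical reconstruction: only the overall shape follows the plan.
SELECT count(*)
FROM (
    -- Assumed redundancy test: a row is flagged when its price equals the
    -- previous price for the same sku. IS NOT DISTINCT FROM keeps the first
    -- row of each partition (lag() is NULL there, so del is false).
    SELECT (lag(price) OVER w IS NOT DISTINCT FROM price) AS del
    FROM vendor_prices
    WHERE vendor = 516
    WINDOW w AS (PARTITION BY sku ORDER BY effective_date, id)
) d
WHERE NOT d.del;
```

Whatever the real `del` expression is, the window definition forces all rows for the vendor to be sorted by `(sku, effective_date, id)` before the WindowAgg node can run, which is where the plan spends most of its time.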

Here is the output of EXPLAIN ANALYZE:

    Aggregate  (cost=987648.74..987648.75 rows=1 width=0) (actual time=38220.825..38220.825 rows=1 loops=1)
      ->  Subquery Scan on d  (cost=862040.12..983596.85 rows=1620756 width=0) (actual time=31758.342..38211.262 rows=84245 loops=1)
            Filter: (NOT d.del)
            Rows Removed by Filter: 3094780
            ->  WindowAgg  (cost=862040.12..951181.72 rows=3241513 width=25) (actual time=31758.220..37929.024 rows=3179025 loops=1)
                  ->  Sort  (cost=862040.12..870143.90 rows=3241513 width=25) (actual time=31758.196..34952.249 rows=3179025 loops=1)
                        Sort Key: vendor_prices.sku, vendor_prices.effective_date, vendor_prices.id
                        Sort Method: external merge  Disk: 123448kB
                        ->  Bitmap Heap Scan on vendor_prices  (cost=60790.16..356386.08 rows=3241513 width=25) (actual time=350.911..1512.974 rows=3179025 loops=1)
                              Recheck Cond: (vendor = 516)
                              Heap Blocks: exact=47546
                              ->  Bitmap Index Scan on idx_vendor_number  (cost=0.00..59979.79 rows=3241513 width=0) (actual time=336.936..336.936 rows=3179025 loops=1)
                                    Index Cond: (vendor = 516)

Note: I have a multicolumn index, as @Erwin suggested in his answer:

"A multicolumn index on (vendor, sku, effective_date, id) would be perfect for this, in this particular order."

But the planner is using idx_vendor_number which, as you can see in the EXPLAIN ANALYZE output, is on the vendor column only.
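One way to investigate why the multicolumn index is ignored is sketched below. It first lists the indexes on the table to confirm the column order, then temporarily discourages the bitmap scan and the explicit sort for the current session to see whether the planner can read rows in `(sku, effective_date, id)` order directly from the index. The index name in the `CREATE INDEX` statement is hypothetical, and the planner switches are meant as a diagnostic, not as a production setting.

```sql
-- Confirm which indexes exist and in what column order.
SELECT indexname, indexdef
FROM pg_indexes
WHERE tablename = 'vendor_prices';

-- For reference, the index from the quoted suggestion would be declared
-- like this (hypothetical name; skip if an equivalent index already exists).
CREATE INDEX vendor_prices_vendor_sku_date_id_idx
    ON vendor_prices (vendor, sku, effective_date, id);

-- Refresh planner statistics so cost estimates reflect the current data.
ANALYZE vendor_prices;

-- Diagnostic only, current session: make the bitmap scan and the explicit
-- sort unattractive, then compare the resulting plan and timing.
SET enable_bitmapscan = off;
SET enable_sort = off;

EXPLAIN (ANALYZE, BUFFERS)
SELECT sku, effective_date, id, price
FROM vendor_prices
WHERE vendor = 516
ORDER BY sku, effective_date, id;

-- Restore the defaults for the session.
RESET enable_bitmapscan;
RESET enable_sort;
```

If the ordered index scan turns out slower than the bitmap scan plus sort, the planner's original choice was probably reasonable for a vendor that accounts for roughly 3 million of the 17 million rows.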
"external merge Disk: 123448kB" is your biggest problem. Try increasing work_mem until the sort is done in memory.

@a_horse_with_no_name: Thanks. Increasing work_mem to 512 MB cut about a dozen seconds off the total by doing the merge in memory. That helps, but what we really need is an improvement of at least an order of magnitude.
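The work_mem suggestion from the comments can be tried for a single session without touching the server configuration. This is a minimal sketch: 512MB is the value mentioned in the comment, and the `SELECT` is a stand-in that reproduces only the sort from the plan, not the asker's full query.

```sql
-- Applies to the current session only.
SET work_mem = '512MB';
SHOW work_mem;

-- Re-run and read the "Sort Method" line of the output: an in-memory sort is
-- reported as "quicksort  Memory: ...", a spilled sort as
-- "external merge  Disk: ...".
EXPLAIN (ANALYZE, BUFFERS)
SELECT sku, effective_date, id, price
FROM vendor_prices
WHERE vendor = 516
ORDER BY sku, effective_date, id;

-- Return to the server default afterwards.
RESET work_mem;
```

Keep in mind that work_mem is a per-operation limit, so a single complex query or many concurrent sessions can each use that much memory several times over; a large value is safer set per session than globally.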