SQL: ordering by the result of a subquery is ridiculously slow

sql postgresql

I'm a bit stuck.

Here is my (simplified) query:

Here is my EXPLAIN ANALYZE result:
Limit  (cost=46697025.51..46697025.56 rows=20 width=193) (actual time=80329.201..80329.206 rows=20 loops=1)
  ->  Sort  (cost=46697025.51..46724804.61 rows=11111641 width=193) (actual time=80329.199..80329.202 rows=20 loops=1)
        Sort Key: ((SubPlan 1))
        Sort Method: top-N heapsort  Memory: 29kB
        ->  Seq Scan on documents  (cost=0.00..46401348.74 rows=11111641 width=193) (actual time=0.061..73275.304 rows=11114254 loops=1)
              SubPlan 1
                ->  Aggregate  (cost=3.95..4.05 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=11114254)
                      ->  Index Scan using registrations_document_id_index on registrations  (cost=0.43..3.95 rows=2 width=4) (actual time=0.004..0.004 rows=1 loops=11114254)
                            Index Cond: (document_id = documents.id)
Planning Time: 0.334 ms
Execution Time: 80329.287 ms

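The query takes 1 min 20 s to execute. Is there any way to optimize it? Both tables have a lot of rows (documents: 11114642; registrations: 13176070).
In the real, full query I also have some filters, and it takes 4 seconds to execute, which is still too slow. The ORDER BY on this subquery seems to be the bottleneck, and I can't find a way to optimize it.
I tried adding indexes on the date and document_id columns.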
Try unnesting the query:
SELECT documents.id,
       documents.other_attr,
       max(registrations.date) AS register_date
FROM documents
JOIN registrations ON registrations.document_id = documents.id
GROUP BY documents.id, documents.other_attr
ORDER BY register_date  -- sort by the aggregated date (output column 3)
LIMIT 20;

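The query should at least be supported by an index on registrations (document_id, date).
Don't use a scalar subquery; join against an aggregated derived table instead: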
SELECT documents.*,
       reg.register_date
FROM documents
JOIN (
  -- latest registration date per document, computed once for the whole table
  SELECT document_id, max(date) AS register_date
  FROM registrations
  GROUP BY document_id
) reg ON reg.document_id = documents.id
ORDER BY register_date
LIMIT 20;
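
The index on registrations (document_id, date) mentioned above could be created roughly as follows. This is only a sketch: the index name is hypothetical, and the table and column names are taken from the queries in this thread. Whether the planner actually uses the index depends on the data and on statistics, so it is worth re-checking the plan afterwards.

```sql
-- Composite index on the join key plus the aggregated column.
-- With (document_id, date), a per-document max(date) lookup may be answered
-- from the index without visiting every matching table row.
-- The index name below is an assumption made for illustration.
CREATE INDEX IF NOT EXISTS registrations_document_id_date_idx
    ON registrations (document_id, date);

-- Inspect the plan of the aggregation to see whether the index is picked up.
EXPLAIN (ANALYZE, BUFFERS)
SELECT document_id, max(date) AS register_date
FROM registrations
GROUP BY document_id;
```

Note that EXPLAIN with ANALYZE really executes the statement, so on tables of this size it will take a while.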

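"In the real, full query I also have some filters, and it takes 4 seconds to execute, which is still too slow"
Then ask about that one. What can we say about a query we can't see? Clearly this other query is not simply this one with some rows filtered out after all the work is done, because then it could not be faster than the query you showed us (other than through a hot cache). It is doing something different, and it has to be optimized differently.
"The ORDER BY on this subquery seems to be the bottleneck, and I can't find a way to optimize it"
The timing reported for the Sort node includes the time of all the work that came before it, so the sort itself took 80329.202 - 73275.304 ms, or about 7 seconds. That may be a long time, but it is only a small fraction of the total. (This interpretation is not very obvious from the output itself; it comes from experience.)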
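As a rough cross-check of this timing argument, the arithmetic can be run directly in PostgreSQL. All figures are copied from the EXPLAIN ANALYZE output in the question. The per-loop time of 0.005 ms is a rounded value, so the subplan total is only an estimate, and the column aliases are made up for illustration.

```sql
SELECT
    -- Sort node end time minus Seq Scan end time: time spent in the sort itself (ms)
    80329.202 - 73275.304                  AS sort_only_ms,
    -- per-loop time of SubPlan 1 (ms) times its loop count, converted to seconds
    round(0.005 * 11114254 / 1000.0, 1)    AS subplan_seconds_est;
```

The first column comes out at about 7 seconds and the second at roughly 55 seconds, which suggests that most of the 73 seconds spent in the Seq Scan goes into executing the subplan once for each of the 11 million documents rather than into sorting.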
For the query you showed us, you can make it really fast with a rather complicated construct, but one that is only probabilistically correct:
-- t: the 200 most recent registrations (the window size is a guess)
with t as (select date, document_id from registrations
    order by date desc, document_id desc limit 200),
-- t2: the latest date per document within that window
t2 as (select distinct on (document_id) document_id, date from t
    order by document_id, date desc),
-- t3: the 20 documents with the most recent registration
t3 as (select document_id, date from t2 order by date desc limit 20)
SELECT documents.*,
   t3.date as register_date
FROM documents join t3 on t3.document_id = documents.id
order by register_date;

It would be efficiently supported by:
create index on registrations (date, document_id);
create index on documents (id);

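The idea here is that the 200 most recent registrations will contain at least 20 distinct document_id values. Of course, there is no way to be certain that this is true, so you might have to increase 200 to 20000 (which should still be pretty fast compared to what you are currently doing), or even more, to make sure you get the right answer. This also assumes that each distinct document_id matches exactly one documents.id.

This one takes 44 seconds to execute :c

I recently answered a similar question, and this approach sped up a similar query a lot. It may not be great, though, because I saw that Gordon posted a solution based on the same idea but later deleted it.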
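
For a given state of the data, the window assumption behind the CTE rewrite (at least 20 distinct documents among the most recent registrations) can be checked after the fact. The following is a minimal sketch that reuses the table and column names from the thread; the aliases recent and distinct_documents are made up for illustration, and rows with tied dates at the edge of the window are not handled specially.

```sql
-- How many distinct documents appear among the 200 most recent registrations?
-- 200 is the window size used in the rewrite; change it in both places together.
SELECT count(DISTINCT document_id) AS distinct_documents
FROM (
    SELECT document_id
    FROM registrations
    ORDER BY date DESC, document_id DESC
    LIMIT 200
) AS recent;
```

If distinct_documents comes out below 20, the window is too small for the rewrite to return a full page of 20 rows, and the LIMIT should be raised before relying on the result.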