Why does the LIMIT value affect how PostgreSQL processes a SELECT query?


sql postgresql indexing


I have a table named user_profiles that now has more than 24 million rows. I need to retrieve all of the data and index it into Elasticsearch.

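I wrote a program to consume and transform the data so that it can be indexed into ES. I use .Rows() from xorm to select the data from the database so that memory usage does not blow up. It used to work well.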

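I tried to reindex all of the documents again, but loading the data from the DB is now much slower. Previously, when I ran the select-all query with ORDER BY, it returned the first row almost immediately; now it does not.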

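I checked the EXPLAIN output and found that with a LIMIT of 13.05M rows the planner uses the index that matches the query's ORDER BY, but with a LIMIT of 13.06M it no longer does.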

As I recall, the last time I indexed the documents the table was at around 10M rows.

With a LIMIT of 13050000:

- Plan: 
    Node Type: "Limit"
    Parallel Aware: false
    Startup Cost: 0.56
    Total Cost: 30928006.04
    Plan Rows: 13050000
    Plan Width: 592
    Plans: 
      - Node Type: "Index Scan"
        Parent Relationship: "Outer"
        Parallel Aware: false
        Scan Direction: "Forward"
        Index Name: "user_profiles_pkey"
        Relation Name: "user_profiles"
        Alias: "user_profiles"
        Startup Cost: 0.56
        Total Cost: 56959518.12
        Plan Rows: 24033936
        Plan Width: 592

With a LIMIT of 13060000:

- Plan: 
    Node Type: "Limit"
    Parallel Aware: false
    Startup Cost: 30605613.02
    Total Cost: 30638284.91
    Plan Rows: 13060000
    Plan Width: 592
    Plans: 
      - Node Type: "Sort"
        Parent Relationship: "Outer"
        Parallel Aware: false
        Startup Cost: 30605613.02
        Total Cost: 30665697.86
        Plan Rows: 24033936
        Plan Width: 592
        Sort Key: 
          - "user_id"
          - "system_name"
        Plans: 
          - Node Type: "Seq Scan"
            Parent Relationship: "Outer"
            Parallel Aware: false
            Relation Name: "user_profiles"
            Alias: "user_profiles"
            Startup Cost: 0.00
            Total Cost: 2357864.36
            Plan Rows: 24033936
            Plan Width: 592

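I see very high read and write IOPS in the AWS RDS monitoring tool. I think the DB is trying to redo the sort and ignoring the fact that it could use the primary key index directly. What can I do?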

Here is the EXPLAIN query:

EXPLAIN ( FORMAT YAML )
SELECT *
FROM "user_profiles"
ORDER BY "user_id", "system_name"
LIMIT 13050000;

Here is the table structure:

CREATE TABLE user_profiles
(
    user_id     UUID        NOT NULL,
    system_name VARCHAR(50) NOT NULL,
    key_values  TEXT        NOT NULL,
    CONSTRAINT user_profiles_pk PRIMARY KEY (user_id, system_name)
);

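An index scan is a kind of random read, which is usually more expensive than a sequential read. How much more expensive depends heavily on your storage device; on an HDD, for example, it is roughly 10 times slower. The Postgres planner uses expected costs to choose the cheaper plan. By increasing the limit, you reach the boundary beyond which a seq scan followed by a sort is expected to be more efficient than the index scan.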
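
The first plan in the question shows how such an estimate is formed. As a rough sketch, assuming the planner prices a Limit node by scaling the child's run cost by the fraction of rows it expects to fetch, the Limit cost over the index scan can be recomputed from the numbers in that plan:

```sql
-- Limit cost ~= child startup cost
--             + (child total cost - child startup cost) * limit_rows / estimated_rows
-- All figures are copied from the LIMIT 13050000 plan in the question.
SELECT round(0.56 + (56959518.12 - 0.56) * 13050000 / 24033936, 2)
       AS limit_cost_over_index_scan;
```

The result comes out at about 30928006, in line with the Total Cost of 30928006.04 reported for the Limit node. The planner compares an estimate like this with the estimate for the Seq Scan plus Sort plan and picks whichever is lower, which is why a small change in LIMIT can flip the plan.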

Consider running several smaller queries instead of fetching 13M records at once.
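
One way to split the read into several queries is keyset pagination on the primary key. The following is only a sketch: the batch size of 10000, the prepared statement name next_batch, and the example key values passed to EXECUTE are illustrative assumptions, and the application would pass the key of the last row it received from the previous batch.

```sql
-- First batch: start at the beginning of the primary key order.
SELECT *
FROM "user_profiles"
ORDER BY "user_id", "system_name"
LIMIT 10000;

-- Later batches: continue strictly after the last key already seen.
-- The row comparison follows the same column order as the primary key.
PREPARE next_batch (uuid, varchar) AS
SELECT *
FROM "user_profiles"
WHERE ("user_id", "system_name") > ($1, $2)
ORDER BY "user_id", "system_name"
LIMIT 10000;

-- Hypothetical values standing in for the last row of the previous batch.
EXECUTE next_batch('00000000-0000-0000-0000-000000000000', 'example_system');
```

With a small LIMIT, each batch only needs a short range of the index, so the planner is far less likely to prefer sorting the whole table. It is still worth confirming the chosen plan with EXPLAIN.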

What happens after VACUUM and ANALYZE? No change.
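
If VACUUM and ANALYZE leave the plan unchanged, one further diagnostic is to look at the cost parameters behind the random versus sequential read estimate. The sketch below changes random_page_cost for the current session only. The value 1.1 is an illustrative assumption for SSD-backed storage, not a recommendation, and whether it changes the plan depends on the data and the instance.

```sql
-- Inspect the page cost parameters the planner uses
-- (PostgreSQL defaults: seq_page_cost = 1.0, random_page_cost = 4.0).
SHOW seq_page_cost;
SHOW random_page_cost;

-- Session-only experiment: tell the planner random reads are cheaper.
SET random_page_cost = 1.1;

-- Re-check which plan is chosen for the full ordered read.
EXPLAIN
SELECT *
FROM "user_profiles"
ORDER BY "user_id", "system_name";

-- Restore the configured value for this session.
RESET random_page_cost;
```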