How to optimize a PostgreSQL query when one dependent column is a timestamp


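I have a table with a foreign key and a timestamp that records when the row was most recently updated. Rows with the same foreign key value are updated at roughly the same time, plus or minus an hour. I have an index on (foreign_key, timestamp). This is on PostgreSQL 11.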

When I run a query like:

SELECT * FROM table WHERE foreign_key = 1 AND timestamp > 2 ORDER BY primary_key

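If the timestamp condition is selective across the whole table, the planner uses my index. However, if the timestamp is far enough in the past that most rows match, it scans the primary key index instead, assuming that will be faster. If I remove the ORDER BY, the problem goes away.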

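I have looked at PostgreSQL's CREATE STATISTICS, but it does not seem to help when the correlation is over a range of values, such as a timestamp plus or minus five minutes, rather than a specific value.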
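For reference, an extended-statistics object for this case would look roughly like the sketch below. It uses the table and columns listed under the specifics further down; the statistics name is made up for illustration and is not taken from the question.

```sql
-- Hypothetical statistics object name; table and columns are from the question.
CREATE STATISTICS property_home_attributes_mls_updated_stats (dependencies)
    ON mls_id, updated_at
    FROM property_home_attributes;

-- Extended statistics are only populated by a subsequent ANALYZE.
ANALYZE property_home_attributes;
```

In PostgreSQL 11, functional-dependency statistics are only applied to simple equality conditions that compare a column to a constant, so a range condition such as `updated_at < '...'` would not be expected to benefit. That is consistent with the observation above.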

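What is the best way to work around this? I could drop the ORDER BY, but that would complicate the business logic. I could partition the table by the foreign key id, but that is also a fairly expensive change.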

Specifics:

                                            Table "public.property_home_attributes"
        Column        |            Type             | Collation | Nullable |                       Default
----------------------+-----------------------------+-----------+----------+------------------------------------------------------
 id                   | integer                     |           | not null | nextval('property_home_attributes_id_seq'::regclass)
 mls_id               | integer                     |           | not null |
 property_id          | integer                     |           | not null |
 formatted_attributes | jsonb                       |           | not null |
 created_at           | timestamp without time zone |           |          |
 updated_at           | timestamp without time zone |           |          |
Indexes:
    "property_home_attributes_pkey" PRIMARY KEY, btree (id)
    "index_property_home_attributes_on_property_id" UNIQUE, btree (property_id)
    "index_property_home_attributes_on_updated_at" btree (updated_at)
    "property_home_attributes_mls_id_updated_at_idx" btree (mls_id, updated_at)

The table has about 16 million rows.

psql=# EXPLAIN ANALYZE SELECT * FROM property_home_attributes WHERE mls_id = 46 AND (property_home_attributes.updated_at < '2019-10-30 16:52:06.326774') ORDER BY id ASC LIMIT 1000;
                                                                                     QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..10147.83 rows=1000 width=880) (actual time=1519.718..22310.674 rows=1000 loops=1)
   ->  Index Scan using property_home_attributes_pkey on property_home_attributes  (cost=0.56..6094202.57 rows=600576 width=880) (actual time=1519.716..22310.398 rows=1000 loops=1)
         Filter: ((updated_at < '2019-10-30 16:52:06.326774'::timestamp without time zone) AND (mls_id = 46))
         Rows Removed by Filter: 358834
 Planning Time: 0.110 ms
 Execution Time: 22310.842 ms
(6 rows)

And then without the ORDER BY:

psql=# EXPLAIN ANALYZE SELECT * FROM property_home_attributes WHERE mls_id = 46 AND (property_home_attributes.updated_at < '2019-10-30 16:52:06.326774')  LIMIT 1000;
                                                                     QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.56..1049.38 rows=1000 width=880) (actual time=0.053..162.081 rows=1000 loops=1)
   ->  Index Scan using foo on property_home_attributes  (cost=0.56..629893.60 rows=600576 width=880) (actual time=0.053..161.992 rows=1000 loops=1)
         Index Cond: ((mls_id = 46) AND (updated_at < '2019-10-30 16:52:06.326774'::timestamp without time zone))
 Planning Time: 0.100 ms
 Execution Time: 162.140 ms
(5 rows)
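The two Limit costs above can be checked by hand, which helps explain why the planner prefers the primary key scan once an ORDER BY is present. PostgreSQL costs a Limit node by interpolating linearly between the child node's startup cost and total cost, which in effect assumes the matching rows are spread evenly through the scan. A rough check using numbers copied from the two plans (the column aliases are only for illustration):

```sql
-- Limit cost ~= startup + (total - startup) * limit_rows / estimated_rows
SELECT
    round(0.56 + (6094202.57 - 0.56) * 1000 / 600576, 2) AS pkey_scan_limit_cost,
    round(0.56 + (629893.60  - 0.56) * 1000 / 600576, 2) AS mls_index_limit_cost;
```

Both values should land within rounding of the Limit costs reported in the plans. With the ORDER BY, the (mls_id, updated_at) index cannot return rows in id order, so that path would have to read all of the estimated 600576 rows (cost of at least 629893.60) and then sort them, which makes the primary key scan look far cheaper on paper. In practice the rows for mls_id = 46 are apparently not spread evenly through the id range: 358834 rows were removed by the filter before 1000 matches were found.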

If you don't want PostgreSQL to use an index scan on property_home_attributes_pkey to satisfy the ORDER BY, you can simply use

ORDER BY primary_key + 0
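Applied to the query from the question, this would presumably look like the sketch below. Because `id + 0` is an expression rather than the bare indexed column, the primary key index can no longer supply the requested order, so the planner has to fetch the matching rows some other way (most likely through the (mls_id, updated_at) index) and sort them explicitly.

```sql
-- Same query as in the question, with the sort key wrapped in a no-op expression
-- so that property_home_attributes_pkey cannot be used to provide the ordering.
SELECT *
FROM property_home_attributes
WHERE mls_id = 46
  AND updated_at < '2019-10-30 16:52:06.326774'
ORDER BY id + 0 ASC
LIMIT 1000;
```

The result order is unchanged, since adding zero does not alter the values being compared. The trade-off is that all matching rows must be read and sorted before the first 1000 can be returned, so it is worth confirming the new plan with EXPLAIN ANALYZE.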

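In my case, I think I should just use an integer column for versioning, incremented whenever an update is performed, instead of an updated-at timestamp. That would simplify indexing, and the cross-column dependency would be clearer to PostgreSQL's multivariate statistics. I am setting that aside for now, though, because I am curious whether there is a more general solution to the problem I am having.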