Why doesn't this query use an index-only scan in PostgreSQL?

I have a table with 28 columns and 7M records, without a primary key.

CREATE TABLE records (
  direction smallint,
  exporters_id integer,
  time_stamp integer
  ...
)

I created an index on this table and vacuumed the table; autovacuum is also on.

CREATE INDEX exporter_dir_time_only_index ON sacopre_records
USING btree (exporters_id, direction, time_stamp);

I want to run this query:

SELECT count(exporters_id) FROM records WHERE exporters_id = 50

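The table has 6982224 records with exporters_id = 50. I expected this query to use an index-only scan to get the result, but it uses a sequential scan. This is the EXPLAIN ANALYZE output: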

Aggregate  (cost=204562.25..204562.26 rows=1 width=4) (actual time=1521.862..1521.862 rows=1 loops=1)
->  Seq Scan on sacopre_records (cost=0.00..187106.88 rows=6982149 width=4) (actual time=0.885..1216.211 rows=6982224 loops=1)
    Filter: (exporters_id = 50)
    Rows Removed by Filter: 2663
Total runtime: 1521.886 ms

But when I change exporters_id to another id, the query uses an index-only scan:

Aggregate  (cost=46.05..46.06 rows=1 width=4) (actual time=0.321..0.321 rows=1 loops=1)
->  Index Only Scan using exporter_dir_time_only_index on sacopre_records  (cost=0.43..42.85 rows=1281 width=4) (actual time=0.313..0.315 rows=4 loops=1)
    Index Cond: (exporters_id = 47)
    Heap Fetches: 0
Total runtime: 0.358 ms

Where is the problem?

EXPLAIN is telling you why. Take a closer look:

Aggregate  (cost=204562.25..204562.26 rows=1 width=4) (actual time=1521.862..1521.862 rows=1 loops=1)
->  Seq Scan on sacopre_records (cost=0.00..187106.88 rows=6982149 width=4) (actual time=0.885..1216.211 rows=6982224 loops=1)
    Filter: (exporters_id = 50)
    Rows Removed by Filter: 2663
Total runtime: 1521.886 ms

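Your filter removes only 2663 rows out of the 6982224 + 2663 = 6984887 rows in the table, so a sequential scan should be faster than using the index: the disk has to read the 6982224 matching records either way. The scan reads the whole table sequentially and discards the tiny fraction, about 0.04%, that does not match your condition. With an index scan, it would have to jump from the index file to the data file and back for each of the 6982224 matching rows, which would certainly be slower than the 1.5 seconds you are getting now.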
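The proportions can be checked from the numbers in the Seq Scan node itself. This is only an arithmetic sketch: the two constants are the "rows" and "Rows Removed by Filter" values from the plan above, and the column aliases are made up for illustration.

```sql
-- Values copied from the Seq Scan node of the plan:
--   6982224 rows passed the filter, 2663 rows were removed by it.
SELECT 6982224 + 2663                               AS total_rows,
       round(100.0 * 2663    / (6982224 + 2663), 3) AS pct_removed,
       round(100.0 * 6982224 / (6982224 + 2663), 3) AS pct_matching;
```

Since nearly every row matches exporters_id = 50, the predicate is barely selective, which is the situation where the planner tends to prefer a sequential scan.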

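Have you tried selecting COUNT from records where exporters_id=50?

@Tordek, I just tested it and got the same result; it uses a seq scan.

Maybe the new index has not been analyzed, and so has not been made known to the planner? Try VACUUM ANALYZE records.

@VaoTsun, as I said above, I ran VACUUM ANALYZE, and autovacuum is on.

"and back to the data file"... but they are counting an indexed field; surely the engine can just walk the index and ignore the data?

I agree with @Tordek, this does not need to go back to the data file!!

The result of the select is {50}. The index type is btree, so I think getting the result from the index is faster than searching the data file.

@KouberSaparev, regarding the most common value: that is wrong. It may be the case in some DBMSs, but in PostgreSQL a b-tree index does contain all values (common or otherwise), unless it is made a partial index with an explicit WHERE clause in the index definition. Containing them is also useful, for example for index-only scans or for efficiently returning results in index order. Your second point is correct, though: the index is most likely not used here because it is not selective enough. Testing with enable_seqscan = off helps to see the relative cost estimates.

@Arshen, your random_page_cost and seq_page_cost most likely do not accurately reflect the real performance of your system. Or the planner is just not estimating very well. It is not a big difference, though.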
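A minimal sketch of the experiment suggested in the comments, assuming the table is really named sacopre_records as in the plans (the question text also calls it records). It only changes settings for the current session, and the actual plans and timings will depend on the system.

```sql
-- Discourage sequential scans for this session only, so the planner
-- shows the index-based plan together with its estimated cost.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT count(exporters_id) FROM sacopre_records WHERE exporters_id = 50;

-- Restore the default before running anything else.
RESET enable_seqscan;

-- Inspect the page cost parameters the planner is working with.
SHOW seq_page_cost;
SHOW random_page_cost;

-- Pages marked all-visible can be skipped by an index-only scan;
-- compare relallvisible with relpages to see how well VACUUM has covered the table.
SELECT relpages, relallvisible
FROM pg_class
WHERE relname = 'sacopre_records';
```

Comparing the cost estimate of the forced index plan with the Seq Scan estimate above shows how close the planner considers the two alternatives. If the index plan turns out faster in practice despite a higher estimate, the page cost settings may be worth revisiting.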