PostgreSQL: slow query with WHERE id = ANY(...) on a 100M-row table


I have a large "scores" table with more than 100 million rows, in the following format:

(critic_id, book_id, score)

I have a primary key constraint:

CONSTRAINT pk_scoresid PRIMARY KEY (critic_id, book_id)

There are about 225,000 books and about 500 critics.

I run queries such as:

SELECT *
FROM scores s
WHERE s.critic_id = ANY(array[1,2,3,4,5])

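The query above returns about 1.2 million rows and takes about 35 seconds on my local machine. I would really like it to be faster.

If this is a common predicate for queries against the table, where you select a fairly small subset of critic_id values, then with an index on critic_id you could cluster the table on that index so that the rows are physically stored in index order.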

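This co-locates all rows that share the same critic_id value, which increases both the likelihood that the index is used and its performance when it is used.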

If the table already has some inherent clustering by book_id, this may hurt the performance of queries that select by book_id.
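
A minimal sketch of that suggestion, assuming the table is named scores as in the question. The index name scores_critic_id_idx is made up for illustration:

```sql
-- Hypothetical index name; the table name is taken from the question.
CREATE INDEX scores_critic_id_idx ON scores (critic_id);

-- Rewrite the table so that rows are stored in critic_id order.
CLUSTER scores USING scores_critic_id_idx;

-- Refresh planner statistics after the rewrite.
ANALYZE scores;
```

CLUSTER is a one-time operation: rows inserted or updated later are not kept in index order, so it has to be repeated periodically. Because the primary key (critic_id, book_id) already leads with critic_id, clustering on pk_scoresid should give a similar physical layout without an extra index.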

Perhaps you should denormalize the data:

CREATE TYPE book_score AS (
   book_id int,
   score int
);

create table score(
   critic_id int primary key,
   scores book_score[]
);

Of course, this makes inserts and updates more awkward, but it drastically reduces the size of the table.
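
To make the trade-off concrete, here is a sketch of how the denormalized table might be filled and read back. It assumes the original table is named scores and the new one score, as in the definitions above; it is an illustration, not something the answer itself provided:

```sql
-- Fill the denormalized table: one row per critic, all of that critic's
-- (book_id, score) pairs packed into a single array.
INSERT INTO score (critic_id, scores)
SELECT critic_id,
       array_agg(ROW(book_id, score)::book_score ORDER BY book_id)
FROM scores
GROUP BY critic_id;

-- Read it back in the original (critic_id, book_id, score) shape.
SELECT c.critic_id, bs.book_id, bs.score
FROM score AS c
CROSS JOIN LATERAL unnest(c.scores) AS bs
WHERE c.critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);
```

With roughly 500 critics the new table has about 500 rows, so finding a critic is cheap, but changing a single score means rewriting that critic's entire array.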

Try the following query:

SELECT *
FROM scores s
WHERE s.critic_id = 1
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 2
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 3
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 4
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 5

And (eventually) add an index on critic_id alone.
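
The UNION ALL form above needs one branch per id. For an id list of arbitrary length, a similar "one index lookup per id" shape can be sketched with a LATERAL join. This variant is not from the answer, and whether it is any faster than = ANY on this table is untested:

```sql
-- One lookup into scores per element of the id array.
SELECT s.*
FROM unnest(ARRAY[1, 2, 3, 4, 5]) AS ids(critic_id)
CROSS JOIN LATERAL (
    SELECT *
    FROM scores AS sc
    WHERE sc.critic_id = ids.critic_id
) AS s;
```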

What type of index are you using on critic_id? The more traditional way to write this is:

s.critic_id IN (1, 2, 3, 4, 5)

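Try it that way.

Please update your post to include the list of indexes you have defined on the table and the output of EXPLAIN SELECT * ... WHERE s.critic_id IN (1, 2, 3, 4, 5).

You should probably try to optimize the query that produces the final result you actually want, rather than this one.

I did the following, but it did not improve the speed much. Am I doing it right?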
CREATE INDEX score_index
  ON score
  USING btree
  (critic_id);
ALTER TABLE score CLUSTER ON score_index;
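
One detail that may matter here (a note based on documented PostgreSQL behavior, not on anything the poster confirmed): ALTER TABLE ... CLUSTER ON only records which index a later CLUSTER should use; it does not reorder any rows by itself. A sketch of the remaining steps, using the table and index names from the statements above:

```sql
-- Physically rewrite the table in the order of the index recorded by
-- ALTER TABLE ... CLUSTER ON. Takes an ACCESS EXCLUSIVE lock while it runs.
CLUSTER score;

-- Refresh planner statistics after the rewrite.
ANALYZE score;

-- Check the result: a correlation close to 1 (or -1) means the physical
-- row order follows critic_id.
SELECT attname, correlation
FROM pg_stats
WHERE tablename = 'score'
  AND attname = 'critic_id';
```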

EXPLAIN SELECT * FROM scores s WHERE s.critic_id in (1, 2, 3, 4, 5);

"Bitmap Heap Scan on score s  (cost=22183.58..646085.28 rows=1188223 width=16)"
"  Recheck Cond: (detector_id = ANY ('{1,2,3,4,5}'::integer[]))"
"  ->  Bitmap Index Scan on scores_index  (cost=0.00..21886.53 rows=1188223 width=0)"
"        Index Cond: (detector_id = ANY ('{1,2,3,4,5}'::integer[]))"

EXPLAIN (analyze, verbose) SELECT * FROM scores s WHERE s.critic_id = 1 OR s.critic_id = 2 OR s.critic_id = 3 OR s.critic_id = 4 OR s.critic_id = 5

"Bitmap Heap Scan on public.scores s  (cost=23433.49..654761.58 rows=1183187 width=16) (actual time=145.373..7078.141 rows=1121375 loops=1)"
"  Output: critic_id, book_id, score"
"  Recheck Cond: ((s.critic_id = 1) OR (s.critic_id = 2) OR (s.critic_id = 3) OR (s.critic_id = 4) OR (s.critic_id = 5))"
"  Rows Removed by Index Recheck: 33440779"
"  Heap Blocks: exact=43398 lossy=185726"
"  ->  BitmapOr  (cost=23433.49..23433.49 rows=1188223 width=0) (actual time=137.729..137.729 rows=0 loops=1)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=60.175..60.175 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 1)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.473..18.473 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 2)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=21.429..21.429 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 3)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.918..18.918 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 4)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..5493.86 rows=297239 width=0) (actual time=18.729..18.729 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 5)"

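The plan above reports "Heap Blocks: exact=43398 lossy=185726" and "Rows Removed by Index Recheck: 33440779". Lossy heap blocks mean the bitmap did not fit in work_mem, so PostgreSQL tracked whole pages instead of individual rows and had to recheck every row on those pages. A hedged experiment follows; the 64MB value is only an assumed starting point, not a figure from the thread:

```sql
-- Session-local change; 64MB is an assumed value to experiment with.
SET work_mem = '64MB';

-- Rerun the query and compare "Heap Blocks" and
-- "Rows Removed by Index Recheck" with the plan above.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM scores s
WHERE s.critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);

-- Restore the configured default for this session.
RESET work_mem;
```

If the lossy block count drops to zero, the recheck overhead should largely disappear. Physically clustering the table on critic_id would also reduce the number of heap pages the bitmap has to cover.
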