PostgreSQL: slow WHERE id = ANY query on a 100M-row table


I have a large "scores" table with more than 100 million rows, in the following format:

(critic_id, book_id, score)

I have a primary key constraint:

CONSTRAINT pk_scoresid PRIMARY KEY (critic_id, book_id)

There are about 225,000 books and about 500 critics.

Running a query such as:

SELECT *
FROM scores s
WHERE s.critic_id = ANY(array[1,2,3,4,5])

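The query above returns about 1.2 million rows. It takes about 35 seconds on my local machine. I would really like it to be faster.

If this is a common predicate in your queries on this table, where you select a fairly small subset of critic_id values, then with an index on critic_id you could cluster the table on that index so that the rows are physically ordered by it.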

This co-locates all rows that share a critic_id value, which makes it more likely that the index is used and improves performance when it is.


If the table already has some inherent clustering by book_id, this may hurt the performance of queries that select by book_id.
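
A minimal sketch of the clustering step, assuming the table is named scores as in the query and using a hypothetical index name, scores_critic_id_idx. Note that ALTER TABLE ... CLUSTER ON only records which index a later CLUSTER should use; it does not reorder the table by itself. CLUSTER rewrites the whole table and holds an exclusive lock while it runs.

```sql
-- Assumed names: table "scores" comes from the question, the index name is hypothetical.
CREATE INDEX scores_critic_id_idx ON scores USING btree (critic_id);

-- Physically rewrite the table in index order (blocks reads and writes while running).
CLUSTER scores USING scores_critic_id_idx;

-- Refresh planner statistics so the new physical ordering is taken into account.
ANALYZE scores;
```

Clustering is a one-time operation: rows inserted or updated afterwards are not kept in index order, so the CLUSTER command may need to be repeated from time to time.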

Perhaps you should denormalize the data:

CREATE TYPE book_score AS (
   book_id int,
   score int
);

create table score(
   critic_id int primary key,
   scores book_score[]
);

Of course, this creates problems for inserts and updates, but it drastically reduces the size of the table.
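
A sketch of how the denormalized table might be filled from the existing one and read back. It assumes the original table is scores, the new one is score with the book_score type as defined above, and that the score column fits in an int; none of this is confirmed by the original poster.

```sql
-- Build one row per critic, packing all (book_id, score) pairs into an array.
INSERT INTO score (critic_id, scores)
SELECT src.critic_id,
       array_agg(ROW(src.book_id, src.score)::book_score ORDER BY src.book_id)
FROM scores AS src
GROUP BY src.critic_id;

-- Expand the arrays back into one row per (critic, book) pair.
SELECT s.critic_id, b.book_id, b.score
FROM score AS s
CROSS JOIN LATERAL unnest(s.scores) AS b
WHERE s.critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);
```

With roughly 500 critics, the lookup touches only a handful of rows, at the cost of having to rewrite a whole array whenever a single score changes.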

Try the following query:

SELECT *
FROM scores s
WHERE s.critic_id = 1
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 2
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 3
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 4
   UNION ALL
SELECT *
FROM scores s
WHERE s.critic_id = 5

And (eventually) add an index on critic_id only.

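What type of index do you have on critic_id? The more traditional way to write this is:
s.critic_id IN (1, 2, 3, 4, 5)
Try that.

Please update your post to include the list of indexes defined on the table and the output of:
EXPLAIN SELECT * FROM scores s WHERE s.critic_id IN (1, 2, 3, 4, 5)

You should probably try to optimize the query that produces the final result you actually want, rather than this one.

I did the following, but it did not improve the speed much. Am I doing it right?
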
CREATE INDEX score_index
  ON score
  USING btree
  (critic_id);
ALTER TABLE score CLUSTER ON score_index;
EXPLAIN SELECT * FROM scores s WHERE s.critic_id in (1, 2, 3, 4, 5);

"Bitmap Heap Scan on score s  (cost=22183.58..646085.28 rows=1188223 width=16)"
"  Recheck Cond: (detector_id = ANY ('{1,2,3,4,5}'::integer[]))"
"  ->  Bitmap Index Scan on scores_index  (cost=0.00..21886.53 rows=1188223 width=0)"
"        Index Cond: (detector_id = ANY ('{1,2,3,4,5}'::integer[]))"

EXPLAIN (analyze, verbose) SELECT * FROM scores s WHERE s.critic_id = 1 OR s.critic_id = 2 OR s.critic_id = 3 OR s.critic_id = 4 OR s.critic_id = 5;

"Bitmap Heap Scan on public.scores s  (cost=23433.49..654761.58 rows=1183187 width=16) (actual time=145.373..7078.141 rows=1121375 loops=1)"
"  Output: critic_id, book_id, score"
"  Recheck Cond: ((s.critic_id = 1) OR (s.critic_id = 2) OR (s.critic_id = 3) OR (s.critic_id = 4) OR (s.critic_id = 5))"
"  Rows Removed by Index Recheck: 33440779"
"  Heap Blocks: exact=43398 lossy=185726"
"  ->  BitmapOr  (cost=23433.49..23433.49 rows=1188223 width=0) (actual time=137.729..137.729 rows=0 loops=1)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=60.175..60.175 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 1)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.473..18.473 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 2)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=21.429..21.429 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 3)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..4115.16 rows=222746 width=0) (actual time=18.918..18.918 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 4)"
"        ->  Bitmap Index Scan on scores_index  (cost=0.00..5493.86 rows=297239 width=0) (actual time=18.729..18.729 rows=224275 loops=1)"
"              Index Cond: (s.critic_id = 5)"
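
In the EXPLAIN ANALYZE output above, "Heap Blocks: exact=43398 lossy=185726" together with "Rows Removed by Index Recheck: 33440779" suggests that the bitmap did not fit in work_mem and fell back to page-level (lossy) entries, so every row on those pages had to be rechecked. A hedged sketch of how one might test whether more memory helps; the 64MB value is only an illustrative assumption, not a figure from the original thread:

```sql
-- Raise work_mem for this session only; the value is an assumed starting point.
SET work_mem = '64MB';

-- Re-run the plan and check whether "lossy" heap blocks and recheck removals go down.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM scores s
WHERE s.critic_id = ANY (ARRAY[1, 2, 3, 4, 5]);

-- Return to the configured default afterwards.
RESET work_mem;
```

If the lossy count drops to zero but the number of heap blocks read stays high, the remaining cost is probably the physical scatter of the rows, which is what clustering on critic_id is meant to address.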