How can I optimize a PostgreSQL ARRAY_AGG query on large tables?
I am using PostgreSQL for its array functionality. Here is my schema:
CREATE TABLE questions (
id INTEGER PRIMARY KEY,
product_id INTEGER UNIQUE NOT NULL,
body VARCHAR(1000) NOT NULL,
date_written DATE NOT NULL DEFAULT current_date,
asker_name VARCHAR(60) NOT NULL,
asker_email VARCHAR(60) NOT NULL,
reported BOOLEAN DEFAULT FALSE,
helpful INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE answers (
id INTEGER PRIMARY KEY NOT NULL,
question_id INTEGER NOT NULL,
body VARCHAR(1000) NOT NULL,
date_written DATE NOT NULL DEFAULT current_date,
answerer_name VARCHAR(60) NOT NULL,
answerer_email VARCHAR(60) NOT NULL,
reported BOOLEAN DEFAULT FALSE,
helpful INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE photos (
id INTEGER UNIQUE,
answer_id INTEGER NOT NULL,
photo VARCHAR(200)
);
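I am trying to query my answers table to get a list of all answers for a given question id, including an array of all photos that exist for each answer id. The results should be sorted by helpfulness in descending order. So far I have a large query that returns the results I am looking for, but its execution time is 729.595 ms. I am trying to bring the query time down to 200 ms. I have the following indexes to try to improve the query time: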
indexname | indexdef
-----------------+---------------------------------------------------------------------------
answer_id | CREATE UNIQUE INDEX answer_id ON public.answers USING btree (id)
question_id | CREATE INDEX question_id ON public.answers USING btree (question_id)
idx_reported_id | CREATE INDEX idx_reported_id ON public.answers USING btree (reported, id)
answers_pkey | CREATE UNIQUE INDEX answers_pkey ON public.answers USING btree (id)
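In my EXPLAIN ANALYZE output, I noticed that the GroupAggregate is very time-consuming:
GroupAggregate (cost=126222.21..126222.71 rows=25 width=129) (actual time=729.497..729.506 rows=5 loops=1)
Group Key: answers.id
Is there any way to avoid the time-consuming GroupAggregate? Am I missing an index? Here is the query itself: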
SELECT answers.id,
question_id,
body,
date_written,
answerer_name,
answerer_email,
reported,
helpful,
ARRAY_AGG(photo) as photos
FROM answers
LEFT JOIN photos ON answers.id = photos.answer_id
WHERE reported IS false
AND answers.id IN (SELECT id
FROM answers
WHERE question_id = 20012)
GROUP BY answers.id
ORDER BY helpful DESC;
Thank you.
I think you can skip the subquery:
SELECT answers.id, question_id, body, date_written, answerer_name, answerer_email, reported, helpful, ARRAY_AGG(photo) as photos
FROM answers
LEFT JOIN photos ON answers.id = photos.answer_id
WHERE reported IS false AND question_id = 20012
GROUP BY answers.id, question_id, body, date_written, answerer_name, answerer_email, reported, helpful
ORDER BY helpful DESC;
You can add a btree index on photos.answer_id, since this column is used in the join clause.
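The index listing in the question only covers the answers table, so the suggestion above could be sketched as follows. The index name idx_photos_answer_id is a hypothetical choice, not something from the original post.

```sql
-- Hypothetical index name; btree is PostgreSQL's default index type,
-- so USING btree does not need to be spelled out.
CREATE INDEX IF NOT EXISTS idx_photos_answer_id
    ON photos (answer_id);

-- Refresh planner statistics for the table after creating the index.
ANALYZE photos;
```

Whether the planner actually uses the index depends on the data distribution, so it is worth re-running the query under EXPLAIN ANALYZE afterwards.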
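Also, the non-aggregated columns in the SELECT list should be repeated in the GROUP BY clause (PostgreSQL accepts GROUP BY answers.id alone only because id is the primary key of answers).
One approach that usually works well is to aggregate first and then join to the result, rather than aggregating the whole joined result. You also don't need the IN condition: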
SELECT a.id,
a.question_id,
a.body,
a.date_written,
a.answerer_name,
a.answerer_email,
a.reported,
a.helpful,
p.photos
FROM answers a
LEFT JOIN (
select answer_id, array_agg(photo) as photos
from photos
group by answer_id
) p ON a.id = p.answer_id
WHERE reported IS false
AND a.question_id = 20012
ORDER BY a.helpful DESC;
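A possible variant of the aggregate-first query above is a LATERAL subquery, which aggregates photos only for the answers that survive the WHERE clause instead of grouping the entire photos table. This is a sketch rather than a measured improvement, and it assumes an index on photos.answer_id exists so that each per-answer lookup is cheap.

```sql
SELECT a.id,
       a.question_id,
       a.body,
       a.date_written,
       a.answerer_name,
       a.answerer_email,
       a.reported,
       a.helpful,
       p.photos
FROM answers a
LEFT JOIN LATERAL (
    -- Runs once per matching answer row; an aggregate without GROUP BY
    -- always returns exactly one row, so ON true is sufficient.
    SELECT array_agg(ph.photo) AS photos
    FROM photos ph
    WHERE ph.answer_id = a.id
) p ON true
WHERE a.reported IS false
  AND a.question_id = 20012
ORDER BY a.helpful DESC;
```

One behavioural difference to keep in mind: for an answer without photos, this form (like the derived-table join above) yields NULL in the photos column, whereas the original LEFT JOIN with ARRAY_AGG(photo) yields an array containing a single NULL element.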
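I would try changing the order of the conditions in the WHERE clause:
WHERE question_id = 20012 AND reported IS false
so that the number of rows is limited first, although PostgreSQL probably does this automatically.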
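One way to check whether the order of the conditions matters is to compare the plans for both orderings. The sketch below uses a simplified query against the answers table only; in general the planner chooses the evaluation strategy from its cost estimates rather than from the written order of AND-ed conditions, so the two plans would be expected to match.

```sql
-- Conditions in the order used in the question.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM answers
WHERE reported IS false
  AND question_id = 20012;

-- Same conditions, reversed order.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM answers
WHERE question_id = 20012
  AND reported IS false;
```

If both plans show the same scan on the question_id index, reordering the WHERE clause will not change the execution time.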