PostgreSQL: improving jsonb query performance when combined with a relational query
I have a SELECT query over a regular Postgres table that includes a jsonb column. When I select the entire jsonb column, the query is fast: 574 ms. But when I select a top-level path of that same jsonb column, the query slows down about 6x, to 3241 ms. My final query needs to access string array values in four of these top-level jsonb paths, which slows the query to over 5 seconds. I have about 50K records in the cfiles table; the structure of the jsonb column cfiles.property_values is shown further below.
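For reference, a minimal sketch of the cfiles table shape, with column names and types inferred from the queries and plans below (the real table has more columns, e.g. created_at and updated_at):

```sql
-- Assumed minimal schema; types inferred from the query output below.
CREATE TABLE cfiles (
    id              bigint PRIMARY KEY,
    tid             integer,
    dataset_id      bigint,        -- joins to datasets.id
    uuid            text,
    name            text,
    path            text,
    checksum        text,
    content_type    text,
    locked          boolean,
    size            bigint,
    last_modified   timestamptz,
    property_values jsonb          -- the column this question is about
);
```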
WITH i(id) AS (
-- core piece of SQL to select the records you're looking for
SELECT
cfiles.ID
FROM
cfiles
JOIN datasets ON datasets.ID = cfiles.dataset_id
WHERE
cfiles.tid = 5
ORDER BY
cfiles.last_modified DESC
LIMIT 20 OFFSET 0
)
SELECT -- FAST OPTION: getting all of json: no GIN=579ms; with GIN=574ms
cfiles.property_values AS "1907",
-- == vs ==
-- SLOW OPTION: getting a json path: no GIN=3273ms; with GIN=3241ms
cfiles.property_values #>> '{"Sample Names"}' AS "1907",
-- adding another path: with GIN=4028ms
cfiles.property_values #>> '{"Project IDs"}' AS "1908",
-- adding yet another path: with GIN=4774ms
cfiles.property_values #>> '{"Run IDs"}' AS "1909",
-- adding yet another path: with GIN=5558ms
cfiles.property_values #>> '{"Data Type"}' AS "1910",
-- ==== rest of query below I can't change ====
user_permissions.notified_at :: TEXT AS "111",
group_permissions.notified_at :: TEXT AS "112",
user_permissions.task_id :: TEXT AS "113",
group_permissions.task_id :: TEXT AS "114",
datasets.ID AS "151",
datasets.NAME AS "154",
datasets.PATH AS "155",
datasets.last_modified AS "156",
datasets.file_count AS "157",
datasets.locked AS "158",
datasets.content_types AS "159",
cfiles.NAME AS "105",
cfiles.last_modified AS "107",
pg_size_pretty ( cfiles.SIZE :: BIGINT ) AS "106",
cfiles.ID AS "101",
cfiles.tid AS "102",
cfiles.uuid AS "103",
cfiles.PATH AS "104",
cfiles.content_type AS "108",
cfiles.locked AS "109",
cfiles.checksum AS "110"
FROM
cfiles
JOIN i USING(id) -- should match just 20 records
JOIN datasets ON datasets.ID = cfiles.dataset_id
LEFT JOIN user_permissions ON ( user_permissions.cfile_id = cfiles.ID OR user_permissions.dataset_id = datasets.ID )
LEFT JOIN users ON users.ID = user_permissions.user_id
LEFT JOIN group_permissions ON ( group_permissions.cfile_id = cfiles.ID OR group_permissions.dataset_id = datasets.ID )
LEFT JOIN groups ON groups.ID = group_permissions.group_id
LEFT JOIN user_groups ON groups.ID = user_groups.group_id
LEFT JOIN picklist_cfiles ON picklist_cfiles.cfile_id = cfiles.ID
ORDER BY
"107" DESC;
{
  "Sample Names": [up to 200 short strings...],
  "Project IDs": [up to 10 short strings...],
  "Run IDs": [up to 10 short strings...],
  "Data Type": [up to 10 short strings...]
}
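A hypothetical row illustrating this shape (all values invented, and the column list assumed from the schema implied by the query):

```sql
-- Invented example data matching the structure described above.
INSERT INTO cfiles (id, tid, dataset_id, name, property_values)
VALUES (1, 5, 1, 'run42.fastq',
        '{"Sample Names": ["S1", "S2"],
          "Project IDs":  ["P100"],
          "Run IDs":      ["R7"],
          "Data Type":    ["FASTQ"]}'::jsonb);
```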
Following the answer, I tried adding the GIN index below, but it made almost no difference to the runtimes noted in the query comments. I assume that's because my query is not a pure json query using the @> operator, but one combined with a relational query:
CREATE INDEX ON cfiles USING GIN (property_values jsonb_path_ops);
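A jsonb_path_ops GIN index like this one only accelerates containment-style operators such as @>; it cannot serve path extraction with #>> in the SELECT list, which is consistent with the index making no difference here. A sketch of the kind of query it would help (the "FASTQ" value is invented):

```sql
-- A containment predicate can use the jsonb_path_ops GIN index;
-- extracting paths with #>> in the SELECT list cannot.
SELECT id
FROM cfiles
WHERE property_values @> '{"Data Type": ["FASTQ"]}';
```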
I'm surprised by the huge difference between fetching the whole column and querying its top-level json keys. At this point it seems it would be more efficient to fetch the entire jsonb column as one string and split it on the commas and quotes, a hack I'd like to avoid.
The original slow query:

SELECT
cfiles.property_values #>> '{"Sample Names"}' AS "1907",
-- adding another path: with GIN=4028ms
cfiles.property_values #>> '{"Project IDs"}' AS "1908",
-- adding yet another path: with GIN=4774ms
cfiles.property_values #>> '{"Run IDs"}' AS "1909",
-- adding yet another path: with GIN=5558ms
cfiles.property_values #>> '{"Data Type"}' AS "1910",
-- ==== rest of query below I can't change ====
user_permissions.notified_at :: TEXT AS "111",
group_permissions.notified_at :: TEXT AS "112",
user_permissions.task_id :: TEXT AS "113",
group_permissions.task_id :: TEXT AS "114",
datasets.id AS "151",
datasets.name AS "154",
datasets.path AS "155",
datasets.last_modified AS "156",
datasets.file_count AS "157",
datasets.locked AS "158",
datasets.content_types AS "159",
cfiles.name AS "105",
cfiles.last_modified AS "107",
pg_size_pretty ( cfiles.size :: BIGINT ) AS "106",
cfiles.id AS "101",
cfiles.tid AS "102",
cfiles.uuid AS "103",
cfiles.path AS "104",
cfiles.content_type AS "108",
cfiles.locked AS "109",
cfiles.checksum AS "110"
FROM
cfiles
JOIN datasets ON datasets.id = cfiles.dataset_id
LEFT JOIN user_permissions ON ( user_permissions.cfile_id = cfiles.id OR user_permissions.dataset_id = datasets.id )
LEFT JOIN users ON users.id = user_permissions.user_id
LEFT JOIN group_permissions ON ( group_permissions.cfile_id = cfiles.id OR group_permissions.dataset_id = datasets.id )
LEFT JOIN groups ON groups.id = group_permissions.group_id
LEFT JOIN user_groups ON groups.id = user_groups.group_id
LEFT JOIN picklist_cfiles ON picklist_cfiles.cfile_id = cfiles.id
WHERE
cfiles.tid = 5
ORDER BY
"107" DESC
LIMIT 20 OFFSET 0;
Slow query plan:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=13700.06..13700.11 rows=20 width=662) (actual time=5702.511..5702.521 rows=20 loops=1)
Output: ((cfiles.property_values #>> '{"Sample Names"}'::text[])), ((cfiles.property_values #>> '{"Project IDs"}'::text[])), ((cfiles.property_values #>> '{"Run IDs"}'::text[])), ((cfiles.property_values #>> '{"Data Type"}'::text[])), ((user_permissions.notified_at)::text), ((group_permissions.notified_at)::text), ((user_permissions.task_id)::text), ((group_permissions.task_id)::text), datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, cfiles.name, cfiles.last_modified, (pg_size_pretty(cfiles.size)), cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum
-> Sort (cost=13700.06..13810.61 rows=44219 width=662) (actual time=5702.508..5702.512 rows=20 loops=1)
Output: ((cfiles.property_values #>> '{"Sample Names"}'::text[])), ((cfiles.property_values #>> '{"Project IDs"}'::text[])), ((cfiles.property_values #>> '{"Run IDs"}'::text[])), ((cfiles.property_values #>> '{"Data Type"}'::text[])), ((user_permissions.notified_at)::text), ((group_permissions.notified_at)::text), ((user_permissions.task_id)::text), ((group_permissions.task_id)::text), datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, cfiles.name, cfiles.last_modified, (pg_size_pretty(cfiles.size)), cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum
Sort Key: cfiles.last_modified DESC
Sort Method: top-N heapsort Memory: 344kB
-> Hash Left Join (cost=39.53..12523.41 rows=44219 width=662) (actual time=2.535..5526.409 rows=44255 loops=1)
Output: (cfiles.property_values #>> '{"Sample Names"}'::text[]), (cfiles.property_values #>> '{"Project IDs"}'::text[]), (cfiles.property_values #>> '{"Run IDs"}'::text[]), (cfiles.property_values #>> '{"Data Type"}'::text[]), (user_permissions.notified_at)::text, (group_permissions.notified_at)::text, (user_permissions.task_id)::text, (group_permissions.task_id)::text, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, cfiles.name, cfiles.last_modified, pg_size_pretty(cfiles.size), cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum
Hash Cond: (cfiles.id = picklist_cfiles.cfile_id)
-> Nested Loop Left Join (cost=38.19..10918.99 rows=44219 width=867) (actual time=1.639..632.739 rows=44255 loops=1)
Output: cfiles.property_values, cfiles.name, cfiles.last_modified, cfiles.size, cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, user_permissions.notified_at, user_permissions.task_id, group_permissions.notified_at, group_permissions.task_id
Join Filter: ((user_permissions.cfile_id = cfiles.id) OR (user_permissions.dataset_id = datasets.id))
Rows Removed by Join Filter: 177020
-> Nested Loop Left Join (cost=38.19..7822.61 rows=44219 width=851) (actual time=1.591..464.449 rows=44255 loops=1)
Output: cfiles.property_values, cfiles.name, cfiles.last_modified, cfiles.size, cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, group_permissions.notified_at, group_permissions.task_id
Join Filter: ((group_permissions.cfile_id = cfiles.id) OR (group_permissions.dataset_id = datasets.id))
Rows Removed by Join Filter: 354040
-> Hash Join (cost=35.75..4723.32 rows=44219 width=835) (actual time=1.301..163.411 rows=44255 loops=1)
Output: cfiles.property_values, cfiles.name, cfiles.last_modified, cfiles.size, cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types
Inner Unique: true
Hash Cond: (cfiles.dataset_id = datasets.id)
-> Seq Scan on public.cfiles (cost=0.00..4570.70 rows=44219 width=644) (actual time=0.044..49.425 rows=44255 loops=1)
Output: cfiles.id, cfiles.tid, cfiles.uuid, cfiles.dataset_id, cfiles.path, cfiles.name, cfiles.checksum, cfiles.size, cfiles.last_modified, cfiles.content_type, cfiles.locked, cfiles.property_values, cfiles.created_at, cfiles.updated_at
Filter: (cfiles.tid = 5)
Rows Removed by Filter: 1561
-> Hash (cost=28.11..28.11 rows=611 width=199) (actual time=1.234..1.235 rows=611 loops=1)
Output: datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types
Buckets: 1024 Batches: 1 Memory Usage: 149kB
-> Seq Scan on public.datasets (cost=0.00..28.11 rows=611 width=199) (actual time=0.012..0.571 rows=611 loops=1)
Output: datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types
-> Materialize (cost=2.44..3.97 rows=4 width=32) (actual time=0.000..0.002 rows=8 loops=44255)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id
-> Hash Right Join (cost=2.44..3.95 rows=4 width=32) (actual time=0.170..0.248 rows=8 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id
Hash Cond: (user_groups.group_id = groups.id)
-> Seq Scan on public.user_groups (cost=0.00..1.34 rows=34 width=8) (actual time=0.022..0.056 rows=34 loops=1)
Output: user_groups.id, user_groups.tid, user_groups.user_id, user_groups.group_id, user_groups.created_at, user_groups.updated_at
-> Hash (cost=2.39..2.39 rows=4 width=40) (actual time=0.121..0.121 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, groups.id
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Right Join (cost=1.09..2.39 rows=4 width=40) (actual time=0.063..0.092 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, groups.id
Hash Cond: (groups.id = group_permissions.group_id)
-> Seq Scan on public.groups (cost=0.00..1.19 rows=19 width=8) (actual time=0.010..0.017 rows=19 loops=1)
Output: groups.id, groups.tid, groups.name, groups.description, groups.default_uview, groups.created_at, groups.updated_at
-> Hash (cost=1.04..1.04 rows=4 width=40) (actual time=0.032..0.033 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, group_permissions.group_id
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on public.group_permissions (cost=0.00..1.04 rows=4 width=40) (actual time=0.017..0.022 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, group_permissions.group_id
-> Materialize (cost=0.00..1.06 rows=4 width=40) (actual time=0.000..0.001 rows=4 loops=44255)
Output: user_permissions.notified_at, user_permissions.task_id, user_permissions.cfile_id, user_permissions.dataset_id, user_permissions.user_id
-> Seq Scan on public.user_permissions (cost=0.00..1.04 rows=4 width=40) (actual time=0.021..0.025 rows=4 loops=1)
Output: user_permissions.notified_at, user_permissions.task_id, user_permissions.cfile_id, user_permissions.dataset_id, user_permissions.user_id
-> Hash (cost=1.15..1.15 rows=15 width=8) (actual time=0.040..0.040 rows=15 loops=1)
Output: picklist_cfiles.cfile_id
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on public.picklist_cfiles (cost=0.00..1.15 rows=15 width=8) (actual time=0.010..0.017 rows=15 loops=1)
Output: picklist_cfiles.cfile_id
Planning Time: 3.141 ms
Execution Time: 5702.799 ms
(61 rows)
Update: refactoring to the CTE pattern below got my time down to 20 ms:
WITH t AS (
SELECT
cfiles.property_values AS prop_vals,
user_permissions.notified_at :: TEXT AS "111",
group_permissions.notified_at :: TEXT AS "112",
user_permissions.task_id :: TEXT AS "113",
group_permissions.task_id :: TEXT AS "114",
datasets.id AS "151",
datasets.name AS "154",
datasets.path AS "155",
datasets.last_modified AS "156",
datasets.file_count AS "157",
datasets.locked AS "158",
datasets.content_types AS "159",
cfiles.name AS "105",
cfiles.last_modified AS "107",
pg_size_pretty ( cfiles.size :: BIGINT ) AS "106",
cfiles.id AS "101",
cfiles.tid AS "102",
cfiles.uuid AS "103",
cfiles.path AS "104",
cfiles.content_type AS "108",
cfiles.locked AS "109",
cfiles.checksum AS "110"
FROM
cfiles
JOIN datasets ON datasets.id = cfiles.dataset_id
LEFT JOIN user_permissions ON ( user_permissions.cfile_id = cfiles.id OR user_permissions.dataset_id = datasets.id )
LEFT JOIN users ON users.id = user_permissions.user_id
LEFT JOIN group_permissions ON ( group_permissions.cfile_id = cfiles.id OR group_permissions.dataset_id = datasets.id )
LEFT JOIN groups ON groups.id = group_permissions.group_id
LEFT JOIN user_groups ON groups.id = user_groups.group_id
LEFT JOIN picklist_cfiles ON picklist_cfiles.cfile_id = cfiles.id
WHERE
cfiles.tid = 5
LIMIT 20
)
SELECT
prop_vals ->> 'Sample Names' AS "1907",
prop_vals ->> 'Project IDs' AS "1908",
prop_vals ->> 'Run IDs' AS "1909",
prop_vals ->> 'Data Type' AS "1910",
"111", "112", "113", "114", "151", "154", "155", "156", "157",
"158", "159", "105", "107", "106", "101", "102", "103", "104",
"108", "109", "110"
FROM t
ORDER BY "107" DESC;
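The plans shown in this question were presumably captured by prefixing each query with EXPLAIN; a minimal sketch of the invocation (ANALYZE actually executes the query, VERBOSE prints the output columns seen in the plans):

```sql
-- Prefix either variant with this to reproduce the plan dumps below.
EXPLAIN (ANALYZE, VERBOSE)
SELECT cfiles.property_values ->> 'Sample Names'
FROM cfiles
WHERE cfiles.tid = 5
LIMIT 20;
```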
CTE query plan:
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=16.18..16.23 rows=20 width=662) (actual time=18.771..18.779 rows=20 loops=1)
Output: ((t.prop_vals ->> 'Sample Names'::text)), ((t.prop_vals ->> 'Project IDs'::text)), ((t.prop_vals ->> 'Run IDs'::text)), ((t.prop_vals ->> 'Data Type'::text)), t."111", t."112", t."113", t."114", t."151", t."154", t."155", t."156", t."157", t."158", t."159", t."105", t."107", t."106", t."101", t."102", t."103", t."104", t."108", t."109", t."110"
Sort Key: t."107" DESC
Sort Method: quicksort Memory: 368kB
-> Subquery Scan on t (cost=4.05..15.74 rows=20 width=662) (actual time=1.091..18.412 rows=20 loops=1)
Output: (t.prop_vals ->> 'Sample Names'::text), (t.prop_vals ->> 'Project IDs'::text), (t.prop_vals ->> 'Run IDs'::text), (t.prop_vals ->> 'Data Type'::text), t."111", t."112", t."113", t."114", t."151", t."154", t."155", t."156", t."157", t."158", t."159", t."105", t."107", t."106", t."101", t."102", t."103", t."104", t."108", t."109", t."110"
-> Limit (cost=4.05..15.34 rows=20 width=987) (actual time=0.320..1.241 rows=20 loops=1)
Output: cfiles.property_values, ((user_permissions.notified_at)::text), ((group_permissions.notified_at)::text), ((user_permissions.task_id)::text), ((group_permissions.task_id)::text), datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, cfiles.name, cfiles.last_modified, (pg_size_pretty(cfiles.size)), cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum
-> Nested Loop Left Join (cost=4.05..24965.23 rows=44219 width=987) (actual time=0.318..1.224 rows=20 loops=1)
Output: cfiles.property_values, (user_permissions.notified_at)::text, (group_permissions.notified_at)::text, (user_permissions.task_id)::text, (group_permissions.task_id)::text, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, cfiles.name, cfiles.last_modified, pg_size_pretty(cfiles.size), cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum
Join Filter: ((user_permissions.cfile_id = cfiles.id) OR (user_permissions.dataset_id = datasets.id))
Rows Removed by Join Filter: 80
-> Nested Loop Left Join (cost=4.05..20873.92 rows=44219 width=851) (actual time=0.273..1.056 rows=20 loops=1)
Output: cfiles.property_values, cfiles.name, cfiles.last_modified, cfiles.size, cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types, group_permissions.notified_at, group_permissions.task_id
Join Filter: ((group_permissions.cfile_id = cfiles.id) OR (group_permissions.dataset_id = datasets.id))
Rows Removed by Join Filter: 160
-> Nested Loop (cost=1.61..17774.63 rows=44219 width=835) (actual time=0.125..0.745 rows=20 loops=1)
Output: cfiles.property_values, cfiles.name, cfiles.last_modified, cfiles.size, cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum, datasets.id, datasets.name, datasets.path, datasets.last_modified, datasets.file_count, datasets.locked, datasets.content_types
Inner Unique: true
-> Hash Left Join (cost=1.34..4738.00 rows=44219 width=644) (actual time=0.094..0.475 rows=20 loops=1)
Output: cfiles.property_values, cfiles.name, cfiles.last_modified, cfiles.size, cfiles.id, cfiles.tid, cfiles.uuid, cfiles.path, cfiles.content_type, cfiles.locked, cfiles.checksum, cfiles.dataset_id
Hash Cond: (cfiles.id = picklist_cfiles.cfile_id)
-> Seq Scan on public.cfiles (cost=0.00..4570.70 rows=44219 width=644) (actual time=0.046..0.360 rows=20 loops=1)
Output: cfiles.id, cfiles.tid, cfiles.uuid, cfiles.dataset_id, cfiles.path, cfiles.name, cfiles.checksum, cfiles.size, cfiles.last_modified, cfiles.content_type, cfiles.locked, cfiles.property_values, cfiles.created_at, cfiles.updated_at
Filter: (cfiles.tid = 5)
Rows Removed by Filter: 629
-> Hash (cost=1.15..1.15 rows=15 width=8) (actual time=0.034..0.035 rows=15 loops=1)
Output: picklist_cfiles.cfile_id
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on public.picklist_cfiles (cost=0.00..1.15 rows=15 width=8) (actual time=0.010..0.018 rows=15 loops=1)
Output: picklist_cfiles.cfile_id
-> Index Scan using datasets_pkey on public.datasets (cost=0.28..0.29 rows=1 width=199) (actual time=0.008..0.008 rows=1 loops=20)
Output: datasets.id, datasets.tid, datasets.bucket_path_id, datasets.path, datasets.name, datasets.last_modified, datasets.file_count, datasets.size, datasets.content_types, datasets.locked, datasets.created_at, datasets.updated_at
Index Cond: (datasets.id = cfiles.dataset_id)
-> Materialize (cost=2.44..3.97 rows=4 width=32) (actual time=0.005..0.009 rows=8 loops=20)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id
-> Hash Right Join (cost=2.44..3.95 rows=4 width=32) (actual time=0.088..0.122 rows=8 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id
Hash Cond: (user_groups.group_id = groups.id)
-> Seq Scan on public.user_groups (cost=0.00..1.34 rows=34 width=8) (actual time=0.007..0.016 rows=34 loops=1)
Output: user_groups.id, user_groups.tid, user_groups.user_id, user_groups.group_id, user_groups.created_at, user_groups.updated_at
-> Hash (cost=2.39..2.39 rows=4 width=40) (actual time=0.069..0.069 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, groups.id
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Right Join (cost=1.09..2.39 rows=4 width=40) (actual time=0.043..0.064 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, groups.id
Hash Cond: (groups.id = group_permissions.group_id)
-> Seq Scan on public.groups (cost=0.00..1.19 rows=19 width=8) (actual time=0.006..0.011 rows=19 loops=1)
Output: groups.id, groups.tid, groups.name, groups.description, groups.default_uview, groups.created_at, groups.updated_at
-> Hash (cost=1.04..1.04 rows=4 width=40) (actual time=0.022..0.022 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, group_permissions.group_id
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on public.group_permissions (cost=0.00..1.04 rows=4 width=40) (actual time=0.009..0.014 rows=4 loops=1)
Output: group_permissions.notified_at, group_permissions.task_id, group_permissions.cfile_id, group_permissions.dataset_id, group_permissions.group_id
-> Materialize (cost=0.00..1.06 rows=4 width=40) (actual time=0.001..0.003 rows=4 loops=20)
Output: user_permissions.notified_at, user_permissions.task_id, user_permissions.cfile_id, user_permissions.dataset_id, user_permissions.user_id
-> Seq Scan on public.user_permissions (cost=0.00..1.04 rows=4 width=40) (actual time=0.018..0.022 rows=4 loops=1)
Output: user_permissions.notified_at, user_permissions.task_id, user_permissions.cfile_id, user_permissions.dataset_id, user_permissions.user_id
Planning Time: 4.049 ms
Execution Time: 19.128 ms
(60 rows)
The slow query is detoasting the large jsonb data for all 44255 rows, and then carrying the parsed-out values through the sort in order to pick out the top 20 rows. I don't know why it is so eager to do that; 44235 jsonb values get detoasted just to be thrown away. Your fast query is presumably returning TOAST pointers from the hash join, sorting the rows using those small pointers, and then detoasting only the 20 survivors. In the EXPLAIN ANALYZE case it doesn't even detoast the survivors, it just throws the pointers away. That's the why. As for what to do about it: if you really can't modify anything in the query below the top part, I doubt there is much you can do on the server side. If you can modify the query more substantially, you can improve the run time with a CTE: have the CTE select the whole jsonb, and then have the SELECT on the CTE extract values from it:
WITH T as (select cfiles.property_values as "1907", <rest of query>)
SELECT "1907"->>'name1', "1907"->>'name2', <rest of select list> from T;
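Applied to the query in the question, the pattern might look like the sketch below (the elided parts stand for the unchangeable remainder of the original query; the key names and aliases come from the question, and this shape matches the "Subquery Scan on t" visible in the fast plan above):

```sql
-- Sketch only: the CTE sorts and limits on the small columns while the
-- jsonb travels through as a TOAST pointer; the outer SELECT then
-- extracts the four paths from just the 20 surviving rows.
WITH t AS (
    SELECT cfiles.property_values AS prop_vals,
           cfiles.last_modified   AS "107"
           -- ... rest of the original select list ...
    FROM cfiles
    -- ... rest of the original joins and WHERE clauses ...
    WHERE cfiles.tid = 5
    ORDER BY cfiles.last_modified DESC
    LIMIT 20
)
SELECT t.prop_vals ->> 'Sample Names' AS "1907",
       t.prop_vals ->> 'Project IDs'  AS "1908",
       t.prop_vals ->> 'Run IDs'      AS "1909",
       t.prop_vals ->> 'Data Type'    AS "1910"
       -- ... rest of the select list, passed through from t ...
FROM t;
```

On PostgreSQL 12 a single-reference CTE may be inlined by the planner; if that ever reintroduces the eager extraction, `WITH t AS MATERIALIZED (...)` forces the CTE to be evaluated first.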
In addition to what @jjanes already said, you can first limit the number of records to 20 and do the rest of the work afterwards. Something like this:
WITH i(id) AS (
-- core piece of SQL to select the records you're looking for
SELECT
cfiles.ID
FROM
cfiles
JOIN datasets ON datasets.ID = cfiles.dataset_id
WHERE
cfiles.tid = 5
ORDER BY
cfiles.last_modified DESC
LIMIT 20 OFFSET 0
)
SELECT -- FAST OPTION: getting all of json: no GIN=579ms; with GIN=574ms
cfiles.property_values AS "1907",
-- == vs ==
-- SLOW OPTION: getting a json path: no GIN=3273ms; with GIN=3241ms
cfiles.property_values #>> '{"Sample Names"}' AS "1907",
-- adding another path: with GIN=4028ms
cfiles.property_values #>> '{"Project IDs"}' AS "1908",
-- adding yet another path: with GIN=4774ms
cfiles.property_values #>> '{"Run IDs"}' AS "1909",
-- adding yet another path: with GIN=5558ms
cfiles.property_values #>> '{"Data Type"}' AS "1910",
-- ==== rest of query below I can't change ====
user_permissions.notified_at :: TEXT AS "101",
group_permissions.notified_at :: TEXT AS "102",
user_permissions.task_id :: TEXT AS "103",
group_permissions.task_id :: TEXT AS "104",
datasets.ID AS "151",
datasets.NAME AS "154",
datasets.PATH AS "155",
datasets.last_modified AS "156",
datasets.file_count AS "157",
datasets.locked AS "158",
datasets.content_types AS "159",
cfiles.NAME AS "105",
cfiles.last_modified AS "107",
pg_size_pretty ( cfiles.SIZE :: BIGINT ) AS "106",
cfiles.ID AS "101",
cfiles.tid AS "102",
cfiles.uuid AS "103",
cfiles.PATH AS "104",
cfiles.content_type AS "108",
cfiles.locked AS "109",
cfiles.checksum AS "110"
FROM
cfiles
JOIN i USING(id) -- should match just 20 records
JOIN datasets ON datasets.ID = cfiles.dataset_id
LEFT JOIN user_permissions ON ( user_permissions.cfile_id = cfiles.ID OR user_permissions.dataset_id = datasets.ID )
LEFT JOIN users ON users.ID = user_permissions.user_id
LEFT JOIN group_permissions ON ( group_permissions.cfile_id = cfiles.ID OR group_permissions.dataset_id = datasets.ID )
LEFT JOIN groups ON groups.ID = group_permissions.group_id
LEFT JOIN user_groups ON groups.ID = user_groups.group_id
LEFT JOIN picklist_cfiles ON picklist_cfiles.cfile_id = cfiles.ID
ORDER BY
"107" DESC;
You may want to rewrite the two left joins that have OR conditions; you could use subqueries with UNION ALL instead. That might speed things up a bit.
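One way that rewrite might look, shown here for the `user_permissions` join only (a sketch, not a drop-in replacement: the subquery is aliased back to `user_permissions` so the rest of the select list keeps working, and the LATERAL form is one of several possible shapes):

```sql
-- Sketch: replacing the OR'ed left-join condition with a LATERAL
-- subquery that unions the two match cases. Each branch has a single
-- equality condition the planner can satisfy with a plain index,
-- instead of an OR it cannot use an index for.
-- Caveat: a permission row matching on both cfile_id and dataset_id
-- would appear twice here, where the OR join returned it once.
LEFT JOIN LATERAL (
    SELECT up.notified_at, up.task_id
    FROM user_permissions up
    WHERE up.cfile_id = cfiles.ID
    UNION ALL
    SELECT up.notified_at, up.task_id
    FROM user_permissions up
    WHERE up.dataset_id = datasets.ID
) user_permissions ON TRUE
```

The `group_permissions` join could be rewritten the same way.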
Comments:

Perhaps the problem is an old PostgreSQL version. Since 9.6, select list entries are evaluated after sorting.

@LaurenzAlbe Thanks for taking a look - yes, I tried the -> and ->> operators and there was no difference. Just added the fast execution plan above.

@LaurenzAlbe Just updated - it's PostgreSQL version 12.

@jjanes Thanks, I hadn't heard of TOAST. If I am able to edit any/all of the queries, is there a way to detoast only the survivors but still assign them to the column names?

@jjanes Thanks - using the CTE format got me down to 17 ms with only minimal changes to the original query.

Wow, that got it down to 60 ms, thanks - will have to try reworking the whole query.

@simj: Can you show us the result of EXPLAIN ANALYZE? Always interested to see what the database is doing.