PostgreSQL: huge difference in run time when deleting from a Postgres temp table
I had a query that took a very long time to run, so I rewrote it, and now it runs almost instantly, but I don't understand why. I could understand a small difference, but can someone explain the huge difference in run time between these two (seemingly very similar) statements?
First:
DELETE FROM t_old where company_id not in (select company_id from t_prop);
Second:
DELETE FROM t_old a
using t_prop b
where a.company_id=b.company_id
and b.company_id is null;
Execution plan for the first statement:
'[
{
"Plan": {
"Startup Cost": 0,
"Plans": [
{
"Filter": "(NOT (SubPlan 1))",
"Startup Cost": 0,
"Plans": [
{
"Startup Cost": 0,
"Plans": [
{
"Startup Cost": 0,
"Node Type": "Seq Scan",
"Plan Rows": 158704,
"Relation Name": "t_prop",
"Alias": "t_prop",
"Parallel Aware": false,
"Parent Relationship": "Outer",
"Plan Width": 4,
"Total Cost": 2598.04
}
],
"Node Type": "Materialize",
"Plan Rows": 158704,
"Parallel Aware": false,
"Parent Relationship": "SubPlan",
"Plan Width": 4,
"Subplan Name": "SubPlan 1",
"Total Cost": 4011.56
}
],
"Node Type": "Seq Scan",
"Plan Rows": 21760,
"Relation Name": "t_old",
"Alias": "t_old",
"Parallel Aware": false,
"Parent Relationship": "Member",
"Plan Width": 6,
"Total Cost": 95923746.03
}
],
"Node Type": "ModifyTable",
"Plan Rows": 21760,
"Relation Name": "t_old",
"Alias": "t_old",
"Parallel Aware": false,
"Operation": "Delete",
"Plan Width": 6,
"Total Cost": 95923746.03
}
}
]"
Execution plan for the second statement:
'[
{
"Plan": {
"Startup Cost": 0.71,
"Plans": [
{
"Startup Cost": 0.71,
"Plans": [
{
"Startup Cost": 0.42,
"Scan Direction": "Forward",
"Plan Width": 10,
"Node Type": "Index Scan",
"Index Cond": "(company_id IS NULL)",
"Plan Rows": 1,
"Relation Name": "t_prop",
"Alias": "b",
"Parallel Aware": false,
"Parent Relationship": "Outer",
"Total Cost": 8.44,
"Index Name": "t_prop_idx2"
},
{
"Startup Cost": 0.29,
"Scan Direction": "Forward",
"Plan Width": 10,
"Node Type": "Index Scan",
"Index Cond": "(company_id = b.company_id)",
"Plan Rows": 5,
"Relation Name": "t_old",
"Alias": "a",
"Parallel Aware": false,
"Parent Relationship": "Inner",
"Total Cost": 8.38,
"Index Name": "t_old_idx"
}
],
"Node Type": "Nested Loop",
"Plan Rows": 5,
"Join Type": "Inner",
"Parallel Aware": false,
"Parent Relationship": "Member",
"Plan Width": 12,
"Total Cost": 16.86
}
],
"Node Type": "ModifyTable",
"Plan Rows": 5,
"Relation Name": "t_old",
"Alias": "a",
"Parallel Aware": false,
"Operation": "Delete",
"Plan Width": 12,
"Total Cost": 16.86
}
}
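]"
Your second query does not delete anything, which is why it is so much faster.
Edit: I guess I should explain why it does not delete anything. The join condition a.company_id=b.company_id and the filter b.company_id is null can never both hold for the same row. What you actually want to do is: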
DELETE FROM t_old a
using t_old a2
LEFT JOIN t_prop b ON b.company_id = a2.company_id
where a.company_id=a2.company_id
and b.company_id is null;
It may be faster, slower, or the same speed as the first query, but it will also do the same thing.
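Another common way to express the same intent is a NOT EXISTS anti-join. The following is only a sketch against the t_old and t_prop tables from the question, not something taken from the posted plans, and its NULL handling differs slightly from both the NOT IN form and the self-join form above, as the comments point out.

```sql
-- Sketch: delete rows of t_old that have no matching company_id in t_prop.
DELETE FROM t_old a
WHERE NOT EXISTS (
    SELECT 1
    FROM t_prop b
    WHERE b.company_id = a.company_id
);
-- NULL caveats:
--   * A row of t_old whose company_id is NULL never matches anything, so
--     NOT EXISTS deletes it; NOT IN (over a non-empty subquery) and the
--     self-join form both leave such a row in place.
--   * If t_prop contains a NULL company_id, NOT IN is never true and deletes
--     nothing, while NOT EXISTS is unaffected by that NULL.
```

Running the statement inside BEGIN ... ROLLBACK first is a cheap way to check the reported row count before committing to it.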
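Your second query will only delete rows in t_old if there are also rows in t_prop that match on company_id, because you are doing an inner join there. But there is the additional condition b.company_id is null, which restricts the rows from t_prop to only those where this column is null. The = operator does not work with null values and never evaluates to true for them, so whenever the second condition is satisfied, the first one fails. Since the two conditions are combined with AND, both must hold at the same time, which is impossible.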
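The point about = and null can be checked on a couple of scratch tables. This is a minimal sketch: demo_old and demo_prop are hypothetical names made up for the illustration, and the SELECT mirrors the join and filter of the second DELETE so that nothing is actually removed.

```sql
-- Hypothetical scratch tables, used only for this illustration.
CREATE TEMP TABLE demo_old  (company_id int);
CREATE TEMP TABLE demo_prop (company_id int);
INSERT INTO demo_old  VALUES (1), (2), (NULL);
INSERT INTO demo_prop VALUES (1), (NULL);

-- Comparing NULL with = yields NULL (unknown), not true.
SELECT (NULL::int = NULL::int) IS NULL AS eq_is_unknown;   -- true

-- Same shape as the second DELETE: equality join plus "b.company_id IS NULL".
-- Only the (1, 1) pair survives the join, and the filter then rejects it.
SELECT count(*) AS rows_that_would_be_deleted
FROM demo_old a
JOIN demo_prop b ON a.company_id = b.company_id
WHERE b.company_id IS NULL;                                -- 0
```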
If you want to delete the rows in t_old whose company_id is NULL and that also exist in t_prop with a NULL company_id, you can use:
DELETE FROM t_old a
using t_prop b
where a.company_id IS NOT DISTINCT FROM b.company_id
and b.company_id is null;
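But it still will not do what the first query does.
You can see from the explain plan that the second query uses index scans, which are much faster here. Joins are usually faster than IN subqueries.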
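To compare statements on actual run time rather than on estimated cost, EXPLAIN ANALYZE can be run inside a transaction that is rolled back afterwards, since EXPLAIN ANALYZE really executes the DELETE. The sketch below uses the question's tables. The work_mem value is an arbitrary assumption to experiment with: the first plan's filter reads NOT (SubPlan 1) rather than NOT (hashed SubPlan 1), which suggests the materialized t_prop list is rescanned for every row of t_old, and the planner only hashes the subquery result when it expects it to fit in work_mem, so a larger setting may change that plan.

```sql
BEGIN;
-- Assumption: 64MB is just a value to try; SET LOCAL limits it to this transaction.
SET LOCAL work_mem = '64MB';

-- ANALYZE runs the statement for real and reports actual timings and row counts.
EXPLAIN (ANALYZE, BUFFERS)
DELETE FROM t_old
WHERE company_id NOT IN (SELECT company_id FROM t_prop);

ROLLBACK;  -- undo the delete that EXPLAIN ANALYZE performed
```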