Warning: file_get_contents(/data/phpspider/zhask/data//catemap/4/postgresql/10.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Postgresql 同时刷新物化视图会导致表膨胀_Postgresql_Postgresql 9.5_Materialized Views_Vacuum_Autovacuum - Fatal编程技术网

Postgresql 同时刷新物化视图会导致表膨胀

Postgresql 同时刷新物化视图会导致表膨胀,postgresql,postgresql-9.5,materialized-views,vacuum,autovacuum,Postgresql,Postgresql 9.5,Materialized Views,Vacuum,Autovacuum,在PostgreSQL 9.5中,我决定创建一个物化视图“效果”,并计划每小时进行一次并发刷新,因为我希望它始终可用: REFRESH MATERIALIZED VIEW CONCURRENTLY effects; 一开始一切都很好,我的物化视图正在刷新,磁盘空间使用量或多或少保持不变 问题 但一段时间后,磁盘使用量开始线性增长 我得出的结论是,这种增长的原因是物化视图,并从answer运行查询以获得以下结果: what | by

在PostgreSQL 9.5中,我决定创建一个物化视图“效果”,并计划每小时进行一次并发刷新,因为我希望它始终可用:

REFRESH MATERIALIZED VIEW CONCURRENTLY effects;
一开始一切都很好,我的物化视图正在刷新,磁盘空间使用量或多或少保持不变


问题 但一段时间后,磁盘使用量开始线性增长

我得出的结论是,这种增长的原因是物化视图,并从answer运行查询以获得以下结果:

               what                |  bytes/ct   | bytes_pretty | bytes_per_row
-----------------------------------+-------------+--------------+---------------
 core_relation_size                | 32224567296 | 30 GB        |         21140
 visibility_map                    |      991232 | 968 kB       |             0
 free_space_map                    |     7938048 | 7752 kB      |             5
 table_size_incl_toast             | 32233504768 | 30 GB        |         21146
 indexes_size                      | 22975922176 | 21 GB        |         15073
 total_size_incl_toast_and_indexes | 55209426944 | 51 GB        |         36220
 live_rows_in_text_representation  |   316152215 | 302 MB       |           207
 ------------------------------    |             |              |
 row_count                         |     1524278 |              |
 live_tuples                       |      676439 |              |
 dead_tuples                       |     1524208 |              |
(11 rows)
然后,通过运行以下命令,我发现上一次自动清空此表是在两天前:

SELECT relname, n_dead_tup, last_vacuum, last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup desc;
我决定手动调用
vacuum(VERBOSE)effects
。它运行了大约半个小时,产生了以下输出:

vacuum (VERBOSE) effects;
INFO:  vacuuming "public.effects"
INFO:  scanned index "effects_idx" to remove 129523454 row versions
DETAIL:  CPU 12.16s/55.76u sec elapsed 119.87 sec

INFO:  scanned index "effects_campaign_created_idx" to remove 129523454 row versions
DETAIL:  CPU 19.11s/154.59u sec elapsed 337.91 sec

INFO:  scanned index "effects_campaign_name_idx" to remove 129523454 row versions
DETAIL:  CPU 28.51s/151.16u sec elapsed 315.51 sec

INFO:  scanned index "effects_campaign_event_type_idx" to remove 129523454 row versions
DETAIL:  CPU 38.60s/373.59u sec elapsed 601.73 sec

INFO:  "effects": removed 129523454 row versions in 3865537 pages
DETAIL:  CPU 59.02s/36.48u sec elapsed 326.43 sec

INFO:  index "effects_idx" now contains 1524208 row versions in 472258 pages
DETAIL:  113679000 index row versions were removed.
463896 index pages have been deleted, 60386 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.

INFO:  index "effects_campaign_created_idx" now contains 1524208 row versions in 664910 pages
DETAIL:  121637488 index row versions were removed.
41014 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  index "effects_campaign_name_idx" now contains 1524208 row versions in 711391 pages
DETAIL:  125650677 index row versions were removed.
696221 index pages have been deleted, 28150 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  index "effects_campaign_event_type_idx" now contains 1524208 row versions in 956018 pages
DETAIL:  127659042 index row versions were removed.
934288 index pages have been deleted, 32105 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  "effects": found 0 removable, 493 nonremovable row versions in 3880239 out of 3933663 pages
DETAIL:  0 dead row versions cannot be removed yet.

There were 666922 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 180.49s/788.60u sec elapsed 1799.42 sec.

INFO:  vacuuming "pg_toast.pg_toast_1371723"
INFO:  index "pg_toast_1371723_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

INFO:  "pg_toast_1371723": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

VACUUM
在这一点上,我认为问题已经解决,并开始思考什么会干扰自动真空。可以肯定的是,我再次运行查询以查找该表的空间使用情况,但令我惊讶的是,它没有改变

只有在我调用
刷新物化视图效果之后不同时执行。直到现在,用于检查表大小的查询的输出才是:

               what                | bytes/ct  | bytes_pretty | bytes_per_row
-----------------------------------+-----------+--------------+---------------
 core_relation_size                | 374005760 | 357 MB       |           245
 visibility_map                    |         0 | 0 bytes      |             0
 free_space_map                    |         0 | 0 bytes      |             0
 table_size_incl_toast             | 374013952 | 357 MB       |           245
 indexes_size                      | 213843968 | 204 MB       |           140
 total_size_incl_toast_and_indexes | 587857920 | 561 MB       |           385
 live_rows_in_text_representation  | 316175512 | 302 MB       |           207
 ------------------------------    |           |              |
 row_count                         |   1524385 |              |
 live_tuples                       |    676439 |              |
 dead_tuples                       |   1524208 |              |
(11 rows)
一切都恢复正常了


问题 问题解决了,但仍有相当多的混乱

  • 谁能解释一下我遇到了什么问题
  • 我今后如何避免这种情况
  • 首先,让我们解释一下膨胀的原因
    REFRESH-MATERIALIZED concurrence
    在中实现,注释很有启发性:

    /*
     * refresh_by_match_merge
     *
     * Refresh a materialized view with transactional semantics, while allowing
     * concurrent reads.
     *
     * This is called after a new version of the data has been created in a
     * temporary table.  It performs a full outer join against the old version of
     * the data, producing "diff" results.  This join cannot work if there are any
     * duplicated rows in either the old or new versions, in the sense that every
     * column would compare as equal between the two rows.  It does work correctly
     * in the face of rows which have at least one NULL value, with all non-NULL
     * columns equal.  The behavior of NULLs on equality tests and on UNIQUE
     * indexes turns out to be quite convenient here; the tests we need to make
     * are consistent with default behavior.  If there is at least one UNIQUE
     * index on the materialized view, we have exactly the guarantee we need.
     *
     * The temporary table used to hold the diff results contains just the TID of
     * the old record (if matched) and the ROW from the new table as a single
     * column of complex record type (if matched).
     *
     * Once we have the diff table, we perform set-based DELETE and INSERT
     * operations against the materialized view, and discard both temporary
     * tables.
     *
     * Everything from the generation of the new data to applying the differences
     * takes place under cover of an ExclusiveLock, since it seems as though we
     * would want to prohibit not only concurrent REFRESH operations, but also
     * incremental maintenance.  It also doesn't seem reasonable or safe to allow
     * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
     * this command.
     */
    
    因此,通过从临时表中删除行并插入新行来刷新物化视图。这当然会导致死元组和表膨胀,您的
    真空(VERBOSE)
    输出证实了这一点

    在某种程度上,这是您同时为
    支付的价格

    其次,让我们揭穿
    VACUUM
    无法移除死元组的神话
    VACUUM
    将移除死行,但它无法减少膨胀(这可以通过
    VACUUM(FULL)
    实现,但这将锁定视图,就像
    刷新物化视图
    而不同时
    一样

    我怀疑您用来确定死元组数量的查询只是一个估计,它错误地估计了死元组的数量

    一个例子说明了这一切 使用
    pgstattuple
    扩展名精确测量表膨胀:

    CREATE EXTENSION pgstattuple;
    
    SELECT * FROM pgstattuple('tab_v');
    -[ RECORD 1 ]------+--------
    table_len          | 4431872
    tuple_count        | 100000
    tuple_len          | 3788895
    tuple_percent      | 85.49
    dead_tuple_count   | 0
    dead_tuple_len     | 0
    dead_tuple_percent | 0
    free_space         | 16724
    free_percent       | 0.38
    
    现在,让我们删除表中的一些行,刷新并再次测量:

    DELETE FROM tab WHERE id BETWEEN 40001 AND 80000;
    
    REFRESH MATERIALIZED VIEW CONCURRENTLY tab_v;
    
    SELECT * FROM pgstattuple('tab_v');
    -[ RECORD 1 ]------+--------
    table_len          | 4431872
    tuple_count        | 60000
    tuple_len          | 2268895
    tuple_percent      | 51.19
    dead_tuple_count   | 40000
    dead_tuple_len     | 1520000
    dead_tuple_percent | 34.3
    free_space         | 16724
    free_percent       | 0.38
    
    很多死元组<代码>真空
    去除这些:

    VACUUM tab_v;
    
    SELECT * FROM pgstattuple('tab_v');
    -[ RECORD 1 ]------+--------
    table_len          | 4431872
    tuple_count        | 60000
    tuple_len          | 2268895
    tuple_percent      | 51.19
    dead_tuple_count   | 0
    dead_tuple_len     | 0
    dead_tuple_percent | 0
    free_space         | 1616724
    free_percent       | 36.48
    

    死元组消失了,但现在有很多空白。

    我将添加到上面提供的@Laurenz-Albe完整答案中。还有另一种可能导致腹胀。考虑下面的场景:

    您的视图在大多数情况下很少更改(1000000条记录,每个请求更改100条记录),但仍然会得到500000个死元组。其原因在索引列中可以为null

    如上所述,当视图同时物化时,将重新创建并比较副本。比较使用强制的唯一索引,但是空值呢?在sql中,空值永远不会彼此相等。因此,如果主键允许空值,则即使未更改也会始终重新创建包含空值的记录并将其添加到表中


    为了解决这个问题,您可以添加额外的列,将空列合并到一些从未使用过的值(-1,to_timestamp(0),…),并仅将此列用于主索引

    ,谢谢您的详细回答!“我怀疑你用来确定死元组数量的查询只是一个估计,它错误地估计了死元组的数量。”-那么你是说我的
    真空冗余效应
    实际上删除了死元组,但没有减少表消耗的空间吗?“真空可以去除死排,但不能减少膨胀”-有没有办法在不锁定的情况下减少膨胀?在膨胀和锁定之间进行选择使得mat_视图在我的例子中毫无用处。此外,膨胀开始“突然”发生的原因还不太清楚。在某种程度上,视图同时刷新,删除的行被释放,只是在某个时候它开始不断增长。你明白我的意思。有像
    pg_squence
    pgrepack
    这样的工具可以在没有
    VACUUM(FULL)
    的情况下重新组织表格。即使有一些膨胀,物化视图也不应该无限增长,因为(正常)
    真空
    后的空白空间将被重用。从我的案例中看到,它确实线性增长,无限增长。超过100+GB,而不是1 GB。也许这是一种特殊情况,意识到这种可能性是件好事。也许是其他条件触发了它。我决定放弃物化视图方法,因为我每次扫描的表只插入到其中-因此我可以通过增量添加来进行刷新(节省大量工作负载并避免上述问题)。也许您的数据库中有某种东西会阻止自动真空长事务,准备好的事务或复制插槽。
    VACUUM tab_v;
    
    SELECT * FROM pgstattuple('tab_v');
    -[ RECORD 1 ]------+--------
    table_len          | 4431872
    tuple_count        | 60000
    tuple_len          | 2268895
    tuple_percent      | 51.19
    dead_tuple_count   | 0
    dead_tuple_len     | 0
    dead_tuple_percent | 0
    free_space         | 1616724
    free_percent       | 36.48