How can I run multiple concurrent insert transactions against postgres without causing a deadlock?
I have a large dump file that I am processing in parallel and inserting into a postgres 9.4.5 database. There are 10 processes, all of which start a transaction, insert ~X000 objects, commit, and repeat until their chunk of the file is done. But they never finish, because the database locks up.
The dump contains 5 million or so objects, each representing an album. An object has a title, a release date, a list of artists, a list of track names, and so on. I have a releases table for these objects (whose primary key comes from the object in the dump), and then join tables with their own primary keys for things like release_artist and release_track.
The tables look like this:
Table: mdc_releases
Column | Type | Modifiers | Storage | Stats target | Description
-----------+--------------------------+-----------+----------+--------------+-------------
id | integer | not null | plain | |
title | text | | extended | |
released | timestamp with time zone | | plain | |
Indexes:
"mdc_releases_pkey" PRIMARY KEY, btree (id)
Table: mdc_release_artists
Column | Type | Modifiers | Storage | Stats target | Description
------------+---------+------------------------------------------------------------------+---------+--------------+-------------
id | integer | not null default nextval('mdc_release_artists_id_seq'::regclass) | plain | |
release_id | integer | | plain | |
artist_id | integer | | plain | |
Indexes:
"mdc_release_artists_pkey" PRIMARY KEY, btree (id)
Inserting one object looks like this:
insert into release(...) values(...) returning id; // refer to id below as $ID
insert into release_meta(release_id, ...) values ($ID, ...);
insert into release_artists(release_id, ...) values ($ID, ...), ($ID, ...), ...;
insert into release_tracks(release_id, ...) values ($ID, ...), ($ID, ...), ...;
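So a transaction looks like BEGIN, the snippet above 5000 times, COMMIT. I have googled around, and I don't understand why what look like independent inserts would cause a deadlock.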
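For reference, the per-process loop described above could be sketched roughly as follows. This is an illustration only, not the actual loader: Python's built-in sqlite3 stands in for postgres so that the snippet runs without a server, the column lists are guesses at the elided `...`, and `chunks`, `load` and `make_db` are hypothetical names. The shape is the point: one transaction per batch, each ended by a commit or a rollback.

```python
import sqlite3
from itertools import islice


def chunks(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def make_db():
    """Create a throwaway in-memory database with two of the tables (assumed columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table release(id integer primary key, title text)")
    conn.execute(
        "create table release_artists("
        "id integer primary key autoincrement, release_id integer, artist_id integer)"
    )
    return conn


def load(conn, releases, batch_size=5000):
    """Insert releases in batches, one transaction per batch."""
    for batch in chunks(releases, batch_size):
        # `with conn` commits if the block succeeds and rolls back if it raises,
        # so the connection is never left with an open transaction.
        with conn:
            for rel in batch:
                conn.execute(
                    "insert into release(id, title) values (?, ?)",
                    (rel["id"], rel["title"]),
                )
                conn.executemany(
                    "insert into release_artists(release_id, artist_id) values (?, ?)",
                    [(rel["id"], artist) for artist in rel["artists"]],
                )
```

With a postgres driver the placeholders and connection setup would differ, but the batch-then-commit structure would stay the same.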
This is what select * from pg_stat_activity shows:
| state_change | waiting | state | backend_xid | backend_xmin | query
+-------------------------------+---------+---------------------+-------------+--------------+---------------------------------
| 2016-01-04 18:42:35.542629-08 | f | active | | 2597876 | select * from pg_stat_activity;
| 2016-01-04 07:36:06.730736-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:37:36.066837-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:37:36.314909-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:37:49.491939-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:36:04.865133-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:38:39.344163-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:36:48.400621-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:34:37.802813-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:37:24.615981-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:37:10.887804-08 | f | idle in transaction | | | BEGIN
| 2016-01-04 07:37:44.200148-08 | f | idle in transaction | | | BEGIN
Is the id of releases inserted explicitly, or is it auto-generated? If the former, you can insert the same value in insert into release_meta etc., rather than having the subsequent inserts depend on the initial insert; if the latter, then I think you need to add an auto-increment default to the releases:id column.
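The two cases in this comment might look like the sketch below. As an assumption for the sake of a runnable example, Python's built-in sqlite3 stands in for postgres, the columns are made up, and the function names are hypothetical. In the first case the id comes from the dump and the child rows reuse it directly; in the second the database assigns the id and it is read back (`cursor.lastrowid` here, `returning id` in the question's postgres statements).

```python
import sqlite3


def make_db():
    """Throwaway in-memory schema (assumed columns, not the real tables)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table release(id integer primary key, title text)")
    conn.execute("create table release_artists(release_id integer, artist_id integer)")
    return conn


def insert_with_known_id(conn, release_id, title, artists):
    """Case 1: the id is supplied by the dump, so child rows reuse it as-is."""
    with conn:
        conn.execute(
            "insert into release(id, title) values (?, ?)", (release_id, title)
        )
        conn.executemany(
            "insert into release_artists(release_id, artist_id) values (?, ?)",
            [(release_id, artist) for artist in artists],
        )
    return release_id


def insert_with_generated_id(conn, title, artists):
    """Case 2: the database generates the id, which is read back for child rows."""
    with conn:
        cur = conn.execute("insert into release(title) values (?)", (title,))
        release_id = cur.lastrowid  # id assigned by the database
        conn.executemany(
            "insert into release_artists(release_id, artist_id) values (?, ?)",
            [(release_id, artist) for artist in artists],
        )
    return release_id
```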