Slow bulk_save_objects in Python SQLAlchemy
I'm running into very slow inserts: inserting 223 items takes over 20 seconds to execute. Any suggestions as to what I'm doing wrong and why it is so slow? I'm using PostgreSQL 9.4.8. Here is the table schema:
           Table "public.trial_locations"
   Column    |          Type          |                          Modifiers
-------------+------------------------+--------------------------------------------------------------
 id          | integer                | not null default nextval('trial_locations_id_seq'::regclass)
 status      | character varying(255) |
 trial_id    | integer                |
 location_id | integer                |
 active      | boolean                |
Indexes:
    "trial_locations_pkey" PRIMARY KEY, btree (id)
    "trial_locations_unique_1" UNIQUE CONSTRAINT, btree (trial_id, location_id)
Foreign-key constraints:
    "trial_locations_location_id_fkey" FOREIGN KEY (location_id) REFERENCES locations(id)
    "trial_locations_trial_id_fkey" FOREIGN KEY (trial_id) REFERENCES trials(id)
The code:
for key, unique_new_location in unique_locations_hash.iteritems():
    trial_location_inserts.append(TrialLocations(
        trial_id=current_trial.id,
        location_id=unique_new_location['location_id'],
        active=True,
        status=unique_new_location['status'],
    ))

LOG_OUTPUT('==========PRE BULK==========', True)
db_session.bulk_save_objects(trial_location_inserts)
LOG_OUTPUT('==========POST BULK==========', True)
Here is the log with SQLAlchemy echo enabled:
2016-12-23 07:37:52.570: ==========PRE BULK==========
2016-12-22 23:37:52,572 INFO sqlalchemy.engine.base.Engine INSERT INTO trial_locations (status, trial_id, location_id, active) VALUES (%(status)s, %(trial_id)s, %(location_id)s, %(active)s)
2016-12-22 23:37:52,572 INFO sqlalchemy.engine.base.Engine ({'status': u'Completed', 'active': True, 'location_id': 733, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 716, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1033, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1548, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1283, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1556, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 4271, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1567, 'trial_id': 126625} ... displaying 10 of 223 total bound parameter sets ... {'status': u'Completed', 'active': True, 'location_id': 1528, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1529, 'trial_id': 126625})
2016-12-23 07:38:14.270: ==========POST BULK==========
Edit:
For comparison, I also rewrote it using SQLAlchemy Core:
if len(trial_location_inserts) > 0:
    LOG_OUTPUT('==========PRE BULK==========', True)
    engine.execute(
        TrialLocations.__table__.insert().values(
            trial_location_core_inserts
        )
    )
    # db_session.bulk_save_objects(trial_location_inserts)
    LOG_OUTPUT('==========POST BULK==========', True)
It ran in 0.028 seconds:
2016-12-23 08:11:26.097: ==========PRE BULK==========
...
2016-12-23 08:11:27.025: ==========POST BULK==========
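The list `trial_location_core_inserts` passed to the Core insert is not shown above. Presumably it is a list of plain dicts keyed by column name, built from the same data as the ORM objects. A minimal sketch of how it might be built, written for Python 3 (so `.values()` rather than `.iteritems()`), with made-up sample data standing in for `unique_locations_hash` and `current_trial.id`:

```python
# Hypothetical sample data; the real unique_locations_hash is not shown in the post.
unique_locations_hash = {
    "loc-a": {"location_id": 733, "status": "Completed"},
    "loc-b": {"location_id": 716, "status": "Completed"},
}
current_trial_id = 126625  # stands in for current_trial.id

# One dict per row, keyed by column name, as Core's insert().values() accepts.
trial_location_core_inserts = [
    {
        "trial_id": current_trial_id,
        "location_id": location["location_id"],
        "active": True,
        "status": location["status"],
    }
    for location in unique_locations_hash.values()
]
```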
For the sake of transactions I would like to keep this in the session, but if Core is the only way, I guess that's that.
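One option that might give both is to run the same Core statement through the session instead of the engine: `Session.execute()` accepts Core constructs and runs them on the session's own connection, inside its current transaction. The sketch below is only an illustration under assumptions: it uses SQLAlchemy 1.4 or later, an in-memory SQLite database instead of PostgreSQL, a cut-down `TrialLocations` model without the foreign keys, and three sample rows. It does not measure speed.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class TrialLocations(Base):
    # Simplified stand-in for the real model (foreign keys omitted).
    __tablename__ = "trial_locations"
    id = Column(Integer, primary_key=True)
    status = Column(String(255))
    trial_id = Column(Integer)
    location_id = Column(Integer)
    active = Column(Boolean)


# In-memory SQLite so the sketch runs anywhere; the post uses PostgreSQL.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
db_session = sessionmaker(bind=engine)()

# Sample rows in the same shape as the logged bound parameters.
rows = [
    {"trial_id": 126625, "location_id": loc, "status": "Completed", "active": True}
    for loc in (733, 716, 1033)
]

# Same multi-row INSERT as the Core version, but executed via the session,
# so it participates in the session's transaction.
db_session.execute(TrialLocations.__table__.insert().values(rows))
count_in_transaction = db_session.query(TrialLocations).count()

# Rolling back the session also undoes the Core insert.
db_session.rollback()
count_after_rollback = db_session.query(TrialLocations).count()
```

Because the insert goes through `db_session`, it is committed or rolled back together with any other work done in that session.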
Thanks for your help.