Python: How to do an upsert with SQLAlchemy?
I have a record that I want to exist in the database if it is not there yet; if it is already there (the primary key exists), I want its fields updated to the current state. This is often called an upsert. The following incomplete code snippet demonstrates an approach that works, but it seems excessively clunky (especially with more columns). What is the better/best way?
import sqlalchemy.orm.exc
from sqlalchemy import Column, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()
Session = sessionmaker()  # assumption: bound to an engine elsewhere

class Template(Base):
    __tablename__ = 'templates'
    id = Column(Integer, primary_key = True)
    name = Column(String(80), unique = True, index = True)
    template = Column(String(80), unique = True)
    description = Column(String(200))
    def __init__(self, Name, Template, Desc):
        self.name = Name
        self.template = Template
        self.description = Desc

def UpsertDefaultTemplate():
    sess = Session()
    desired_default = Template("default", "AABBCC", "This is the default template")
    try:
        q = sess.query(Template).filter_by(name = desired_default.name)
        existing_default = q.one()
    except sqlalchemy.orm.exc.NoResultFound:
        #default does not exist yet, so add it...
        sess.add(desired_default)
    else:
        #default already exists. Make sure the values are what we want...
        assert isinstance(existing_default, Template)
        existing_default.name = desired_default.name
        existing_default.template = desired_default.template
        existing_default.description = desired_default.description
    sess.flush()
Is there a better or less verbose way of doing this? Something like this would be great:
sess.upsert_this(desired_default, unique_key = "name")
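The unique_key kwarg is obviously unnecessary (the ORM should be able to work this out easily), but I added it because SQLAlchemy tends to work only with the primary key. For example, I looked into whether an existing method would be applicable, but it works only on the primary key, which in this case is an auto-incrementing id and not very useful for this purpose.
A sample use case is starting up a server application that may have upgraded its default expected data, i.e. there are no concurrency concerns for this upsert.
SQLAlchemy does have a "save-or-update" behavior, which in recent versions has been built into session.add, but previously was the separate session.save_or_update call. This is not an "upsert", but it may be good enough for your needs.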
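It is good that you are asking about a class with multiple unique keys; I believe this is precisely why there is no single correct way to do it. The primary key is also a unique key. If there were no unique constraints, only the primary key, the problem would be simple enough: if no record exists with the given ID, or if the ID is None, create a new record; otherwise update all other fields of the existing record with that primary key.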
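The primary-key-only case described above is roughly what Session.merge() does. The following is a minimal sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; the trimmed-down Template model and the sample values are illustrative, not the asker's real schema.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Template(Base):
    __tablename__ = "templates"
    id = Column(Integer, primary_key=True)
    name = Column(String(80), unique=True)
    description = Column(String(200))

# In-memory database, used only so the sketch is self-contained.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # No row with id=1 exists yet, so merge() results in an INSERT.
    session.merge(Template(id=1, name="default", description="first version"))
    session.commit()

    # A row with id=1 now exists, so merge() copies the new state onto it (UPDATE).
    session.merge(Template(id=1, name="default", description="second version"))
    session.commit()

    rows = session.query(Template).all()
    row_count = len(rows)
    merged_description = rows[0].description
```

Note that merge() matches on the primary key only, so it does not help when the row has to be identified by another unique column such as name.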
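However, when there are additional unique constraints, this simple approach runs into logical problems. If you want to "upsert" an object whose primary key matches an existing record, but another unique column matches a different record, what should happen? Similarly, if the primary key matches no existing record, but another unique column does match one, what then? There may be a correct answer for your particular situation, but in general I would argue that there is no single correct answer.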
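That is why there is no built-in "upsert" operation. The application must define what it means in each particular case.
SQLAlchemy supports ON CONFLICT through two methods, on_conflict_do_update() and on_conflict_do_nothing(), which are available on the PostgreSQL dialect's insert construct.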
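As a rough illustration of on_conflict_do_update(), the sketch below builds an upsert keyed on the unique name column of the question's templates table and compiles it to SQL, so no database connection is needed. The Table definition mirrors the question's model; treating name as the conflict target is an assumption made for this example.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import insert

metadata = MetaData()
templates = Table(
    "templates", metadata,
    Column("id", Integer, primary_key=True),
    Column("name", String(80), unique=True),
    Column("template", String(80), unique=True),
    Column("description", String(200)),
)

stmt = insert(templates).values(
    name="default",
    template="AABBCC",
    description="This is the default template",
)

# On a conflict on "name", overwrite the other columns with the values
# that the INSERT attempted to write (exposed through stmt.excluded).
upsert_stmt = stmt.on_conflict_do_update(
    index_elements=["name"],
    set_={
        "template": stmt.excluded.template,
        "description": stmt.excluded.description,
    },
)

# Compile for PostgreSQL without connecting to a server.
upsert_sql = str(upsert_stmt.compile(dialect=postgresql.dialect()))
print(upsert_sql)
```

In real use the statement would be passed to session.execute() or connection.execute() against a PostgreSQL database. A conflict on the other unique column (template) is not covered by this conflict target and would still raise an error.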
I use a "look before you leap" approach:
# first get the object from the database if it exists
# we're guaranteed to only get one or zero results
# because we're filtering by primary key
switch_command = session.query(Switch_Command).\
    filter(Switch_Command.switch_id == switch.id).\
    filter(Switch_Command.command_id == command.id).first()
# If we didn't get anything, make one
if not switch_command:
    switch_command = Switch_Command(switch_id=switch.id, command_id=command.id)
# update the stuff we care about
switch_command.output = 'Hooray!'
switch_command.lastseen = datetime.datetime.utcnow()
session.add(switch_command)
# This will generate either an INSERT or UPDATE
# depending on whether we have a new object or not
session.commit()
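The advantage is that this is database-neutral and, I think, clear to read. The disadvantage is a potential race condition in a scenario like the following: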
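- We query the database for a switch_command and don't find one.
- We create a switch_command.
- Another process or thread creates a switch_command with the same primary key as ours.
- We try to commit our switch_command.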
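One way to narrow the race described in the steps above is to let the database constraint decide: attempt the INSERT, and fall back to an UPDATE if the commit raises IntegrityError. The following is only a sketch, assuming SQLAlchemy 1.4 or later and an in-memory SQLite database; insert_or_update and the Template model are hypothetical names for illustration, not part of the answer above.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Template(Base):
    __tablename__ = "templates"
    id = Column(Integer, primary_key=True)
    name = Column(String(80), unique=True)
    description = Column(String(200))

def insert_or_update(session, name, description):
    """Hypothetical helper: try the INSERT first, fall back to an UPDATE."""
    try:
        session.add(Template(name=name, description=description))
        session.commit()  # the unique constraint is enforced here
    except IntegrityError:
        # A row with this name already exists (inserted earlier or by
        # another process), so discard the failed INSERT and update it.
        session.rollback()
        existing = session.query(Template).filter_by(name=name).one()
        existing.description = description
        session.commit()

engine = create_engine("sqlite://")  # in-memory database for illustration
Base.metadata.create_all(engine)

with Session(engine) as session:
    insert_or_update(session, "default", "first version")
    insert_or_update(session, "default", "second version")
    stored = [(t.name, t.description) for t in session.query(Template)]
```

The rollback discards any other pending changes in the same session, so this pattern fits best when each upsert runs in its own transaction.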
I am using the following function, which I wrote, to handle both of these issues (ForeignKeys and unique constraints):
from sqlalchemy import UniqueConstraint, inspect
from sqlalchemy.dialects import postgresql

def upsert(session, model, rows):
    table = model.__table__
    stmt = postgresql.insert(table)
    primary_keys = [key.name for key in inspect(table).primary_key]
    # on conflict, overwrite every non-primary-key column with the incoming value
    update_dict = {c.name: c for c in stmt.excluded if not c.primary_key}
    if not update_dict:
        raise ValueError("insert_or_update resulted in an empty update_dict")
    stmt = stmt.on_conflict_do_update(index_elements=primary_keys,
                                      set_=update_dict)
    seen = set()
    foreign_keys = {col.name: list(col.foreign_keys)[0].column for col in table.columns if col.foreign_keys}
    unique_constraints = [c for c in table.constraints if isinstance(c, UniqueConstraint)]
    def handle_foreignkeys_constraints(row):
        # replace related objects with the value of the column they reference
        for c_name, c_value in foreign_keys.items():
            foreign_obj = row.pop(c_value.table.name, None)
            row[c_name] = getattr(foreign_obj, c_value.name) if foreign_obj else None
        # drop rows that repeat a unique-constraint value already seen in this batch
        for const in unique_constraints:
            unique = tuple([const,] + [row[col.name] for col in const.columns])
            if unique in seen:
                return None
            seen.add(unique)
        return row
    rows = list(filter(None, (handle_foreignkeys_constraints(row) for row in rows)))
    session.execute(stmt, rows)
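This works for me with sqlite3 and postgres, although it might fail with combined primary key constraints and will most probably fail with additional unique constraints.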
from sqlalchemy import insert, update
from sqlalchemy.exc import IntegrityError

# Excerpt from a method body: self._meta, self._log, self._db and data
# belong to the enclosing class.
try:
    t = self._meta.tables[data['table']]
except KeyError:
    self._log.error('table "%s" unknown', data['table'])
    return
try:
    q = insert(t).values(data['values'])
    self._log.debug(q)
    self._db.execute(q)
except IntegrityError:
    # the row already exists: update it, matching on the primary key columns
    self._log.warning('integrity error')
    where_clause = [c == data['values'][c.name] for c in t.c if c.primary_key]
    update_dict = {c.name: data['values'][c.name] for c in t.c if not c.primary_key}
    q = update(t).values(update_dict).where(*where_clause)
    self._log.debug(q)
    self._db.execute(q)
except Exception as e:
    self._log.error('%s: %s', t.name, e)
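The following works for my Redshift database and also handles a combined primary key constraint. It is adapted from another source; only a few modifications were needed to create the SQLAlchemy engine in the function start_engine().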
import os

from sqlalchemy import Column, Integer, Date, MetaData
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.dialects.postgresql import insert
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
from sqlalchemy.dialects import postgresql

Base = declarative_base()

def start_engine():
    engine = create_engine(os.getenv('SQLALCHEMY_URI',
                                     'postgresql://localhost:5432/upsert'))
    connect = engine.connect()
    meta = MetaData(bind=engine)
    meta.reflect(bind=engine)
    return engine

class DigitalSpend(Base):
    __tablename__ = 'digital_spend'
    report_date = Column(Date, nullable=False)
    day = Column(Date, nullable=False, primary_key=True)
    impressions = Column(Integer)
    conversions = Column(Integer)
    def __repr__(self):
        return str([getattr(self, c.name, None) for c in self.__table__.c])

def compile_query(query):
    compiler = (query.compile if not hasattr(query, 'statement')
                else query.statement.compile)
    return compiler(dialect=postgresql.dialect())

def upsert(session, model, rows, as_of_date_col='report_date', no_update_cols=[]):
    table = model.__table__
    stmt = insert(table).values(rows)
    # update every column except the primary key and the excluded ones
    update_cols = [c.name for c in table.c
                   if c not in list(table.primary_key.columns)
                   and c.name not in no_update_cols]
    on_conflict_stmt = stmt.on_conflict_do_update(
        index_elements=table.primary_key.columns,
        set_={k: getattr(stmt.excluded, k) for k in update_cols},
        # note: index_where is rendered as part of the conflict target
        index_where=(getattr(model, as_of_date_col) < getattr(stmt.excluded, as_of_date_col))
    )
    print(compile_query(on_conflict_stmt))
    session.execute(on_conflict_stmt)

# note: start_engine() returns an Engine, which is used here in place of a session;
# initial_rows is the caller's list of row dictionaries
session = start_engine()
upsert(session, DigitalSpend, initial_rows, no_update_cols=['conversions'])
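The upsert above aims to update a row only when the incoming report_date is newer than the stored one. In the PostgreSQL dialect, the where argument of on_conflict_do_update() is the parameter that makes the DO UPDATE part itself conditional. Below is a minimal sketch that only compiles the statement, so it needs no database; the table mirrors DigitalSpend and the sample values are made up for illustration.

```python
import datetime

from sqlalchemy import Column, Date, Integer, MetaData, Table
from sqlalchemy.dialects import postgresql
from sqlalchemy.dialects.postgresql import insert

metadata = MetaData()
digital_spend = Table(
    "digital_spend", metadata,
    Column("report_date", Date, nullable=False),
    Column("day", Date, primary_key=True),
    Column("impressions", Integer),
    Column("conversions", Integer),
)

stmt = insert(digital_spend).values(
    report_date=datetime.date(2020, 1, 2),
    day=datetime.date(2020, 1, 1),
    impressions=100,
)

conditional_upsert = stmt.on_conflict_do_update(
    index_elements=[digital_spend.c.day],
    set_={
        "report_date": stmt.excluded.report_date,
        "impressions": stmt.excluded.impressions,
    },
    # Only overwrite rows whose stored report_date is older than the incoming one.
    where=digital_spend.c.report_date < stmt.excluded.report_date,
)

# Compile for PostgreSQL without connecting to a server.
conditional_sql = str(conditional_upsert.compile(dialect=postgresql.dialect()))
print(conditional_sql)
```

Whether Redshift accepts the resulting statement is not something this sketch verifies; it only shows the SQL that SQLAlchemy renders for PostgreSQL.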
def get_class_by_tablename(tablename):
    """Return class reference mapped to table.
    https://stackoverflow.com/questions/11668355/sqlalchemy-get-model-from-table-name-this-may-imply-appending-some-function-to
    :param tablename: String with name of table.
    :return: Class reference or None.
    """
    for c in Base._decl_class_registry.values():
        if hasattr(c, '__tablename__') and c.__tablename__ == tablename:
            return c

sqla_tbl = get_class_by_tablename(table_name)
def handle_upsert(record_dict, table):
    """
    handles updates when there are primary key conflicts
    """
    # self.active_session(), active_session and session are expected to refer
    # to the same Session provided by the enclosing class
    try:
        # note: add() alone emits no SQL; the INSERT, and any integrity
        # error, happens when the session is flushed or committed
        self.active_session().add(table(**record_dict))
    except:
        # Here we'll assume the error is caused by an integrity error
        # We do this because the error classes are passed from the
        # underlying package (pyodbc / sqllite) SQLAlchemy doesn't mask
        # them with its own code - this should be updated to have
        # explicit error handling for each new db engine
        # <update>add explicit error handling for each db engine</update>
        active_session.rollback()
        # Query for conflict class, use update method to change values based on dict
        c_tbl_primary_keys = [i.name for i in table.__table__.primary_key] # List of primary key col names
        c_tbl_cols = dict(sqla_tbl.__table__.columns) # String:Col Object crosswalk
        c_query_dict = {k:record_dict[k] for k in c_tbl_primary_keys if k in record_dict} # sub-dict from data of primary key:values
        c_oo_query_dict = {c_tbl_cols[k]:v for (k,v) in c_query_dict.items()} # col-object:query value for primary key cols
        c_target_record = session.query(sqla_tbl).filter(*[k==v for (k,v) in c_oo_query_dict.items()]).first()
        # apply new data values to the existing record
        for k, v in record_dict.items():
            setattr(c_target_record, k, v)
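The get_class_by_tablename helper above relies on Base._decl_class_registry, a private attribute that is no longer available as of SQLAlchemy 1.4. A possible equivalent for 1.4 and later, sketched here under that version assumption, walks the mappers of the declarative registry instead. The extra base parameter and the small Template model are additions for this example only.

```python
from sqlalchemy import Column, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class Template(Base):
    __tablename__ = "templates"
    id = Column(Integer, primary_key=True)
    name = Column(String(80))

def get_class_by_tablename(base, tablename):
    """Return the mapped class for a table name, or None if nothing matches."""
    # registry.mappers is the collection of Mapper objects known to this Base.
    for mapper in base.registry.mappers:
        if mapper.local_table.name == tablename:
            return mapper.class_
    return None

found = get_class_by_tablename(Base, "templates")
missing = get_class_by_tablename(Base, "no_such_table")
```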