Python: How can I avoid inserting duplicate entries when adding values via a SQLAlchemy relationship?


Suppose we have two tables in a many-to-many relationship, as shown below:

class User(db.Model):
  __tablename__ = 'user'
  uid = db.Column(db.String(80), primary_key=True)
  languages = db.relationship('Language', lazy='dynamic',
                              secondary='user_language')

class UserLanguage(db.Model):
  __tablename__ = 'user_language'
  __table_args__ = (db.UniqueConstraint('uid', 'lid', name='user_language_ff'),)

  id = db.Column(db.Integer, primary_key=True)
  uid = db.Column(db.String(80), db.ForeignKey('user.uid'))
  lid = db.Column(db.String(80), db.ForeignKey('language.lid'))

class Language(db.Model):
  lid = db.Column(db.String(80), primary_key=True)
  language_name = db.Column(db.String(30))

Now, in the Python shell:

In [4]: user = User.query.all()[0]

In [11]: user.languages = [Language('1', 'English')]

In [12]: db.session.commit()

In [13]: user2 = User.query.all()[1]

In [14]: user2.languages = [Language('1', 'English')]

In [15]: db.session.commit()

IntegrityError: (IntegrityError) column lid is not unique u'INSERT INTO language (lid, language_name) VALUES (?, ?)' ('1', 'English')
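
How can I make the relationship aware that it should ignore duplicates, without violating the unique constraint on the language table? Of course, I could insert each language separately and check beforehand whether the entry already exists in the table, but then many of the benefits of SQLAlchemy relationships are lost.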

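The SQLAlchemy wiki has a recipe for this.

The examples there are a bit convoluted, though. Basically, create a classmethod get_unique as an alternate constructor that first checks a session cache, then queries for an existing instance, and finally creates a new instance. Then call Language.get_unique(id, name) instead of Language(id, name).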
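
The steps above can be sketched roughly as follows. This is a minimal illustration, not the wiki's code: it assumes SQLAlchemy 1.4 or newer, passes the session in explicitly, and keeps the cache in session.info under a made-up key, 'unique_languages'.

```python
from sqlalchemy import Column, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Language(Base):
    __tablename__ = 'language'
    lid = Column(String(80), primary_key=True)
    language_name = Column(String(30))

    @classmethod
    def get_unique(cls, session, lid, language_name):
        # 1. Check a per-session cache, which also covers objects that were
        #    created but not flushed yet. The key name is hypothetical.
        cache = session.info.setdefault('unique_languages', {})
        if lid in cache:
            return cache[lid]
        # 2. Look for an existing row. Autoflush is disabled so the lookup
        #    itself does not flush pending objects.
        with session.no_autoflush:
            obj = session.get(cls, lid)
        # 3. Nothing found: create a new instance and add it to the session.
        if obj is None:
            obj = cls(lid=lid, language_name=language_name)
            session.add(obj)
        cache[lid] = obj
        return obj


engine = create_engine('sqlite://')  # in-memory database for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

first = Language.get_unique(session, '1', 'English')
second = Language.get_unique(session, '1', 'English')  # same object, no second INSERT
session.commit()
```

Note that this sketch never clears the cache, so after a rollback it could hand back an object that is no longer in the session; real code would need to handle that.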

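I wrote an answer to another question in response to a bounty from the OP.

I recommend reading it. In this case, your code would translate into something like the following: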

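# Imports the snippet relies on; Base and session are assumed to be set up elsewhere
from sqlalchemy import Column, ForeignKey, Integer, String, UniqueConstraint
from sqlalchemy.orm import relationship
from sqlalchemy.ext.associationproxy import association_proxy

# NEW: need this function to auto-generate the PK for newly created Language
# here using uuid, but could be any generator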
def _newid():
    import uuid
    return str(uuid.uuid4())

def _language_find_or_create(language_name):
    language = Language.query.filter_by(language_name=language_name).first()
    return language or Language(language_name=language_name)


class User(Base):
  __tablename__ = 'user'
  uid = Column(String(80), primary_key=True)
  languages = relationship('Language', lazy='dynamic',
                              secondary='user_language')

  # proxy the 'language_name' attribute from the 'languages' relationship
  langs = association_proxy('languages', 'language_name',
            creator=_language_find_or_create,
            )

class UserLanguage(Base):
  __tablename__ = 'user_language'
  __table_args__ = (UniqueConstraint('uid', 'lid', name='user_language_ff'),)

  id = Column(Integer, primary_key=True)
  uid = Column(String(80), ForeignKey('user.uid'))
  lid = Column(String(80), ForeignKey('language.lid'))

class Language(Base):
  __tablename__ = 'language'
  # NEW: added a *default* here; replace with your implementation
  lid = Column(String(80), primary_key=True, default=_newid)
  language_name = Column(String(30))

# test code
user = User(uid="user-1")
# NEW: add languages using association_proxy property
user.langs.append("English")
user.langs.append("Spanish")
session.add(user)
session.commit()

user2 = User(uid="user-2")
user2.langs.append("English") # this will not create a new Language row...
user2.langs.append("German")
session.add(user2)
session.commit()

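AttributeError: type object 'Language' has no attribute 'query'

The OP is using flask-sqlalchemy, which adds this attribute to every model. You can replace it with session.query(Language).

But querying does not solve the OP's problem: when there are many subsequent inserts, setting user.languages to a collection of Language objects and committing will greatly slow down adding many users to the database. Is there a solution for "create if not exists" (an upsert in PostgreSQL) that is more efficient than checking for existence every time?

If the list of values is something like a few hundred strings, you could keep a dict() of the current values on the side.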
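
One possible reading of the dict() suggestion is sketched below: fetch all requested languages with a single IN query, index them by name in a dict, and create only the missing ones. The helper name languages_for and the uuid-based primary key are assumptions made for illustration, and the sketch assumes SQLAlchemy 1.4 or newer.

```python
import uuid

from sqlalchemy import Column, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Language(Base):
    __tablename__ = 'language'
    lid = Column(String(80), primary_key=True)
    language_name = Column(String(30))


def languages_for(session, names):
    """Return Language objects for names, creating only the missing ones."""
    wanted = list(dict.fromkeys(names))  # drop duplicates, keep order
    # One SELECT ... WHERE language_name IN (...) instead of one query per name.
    query = session.query(Language).filter(Language.language_name.in_(wanted))
    found = {lang.language_name: lang for lang in query}
    for name in wanted:
        if name not in found:
            found[name] = Language(lid=str(uuid.uuid4()), language_name=name)
            session.add(found[name])
    return [found[name] for name in wanted]


engine = create_engine('sqlite://')  # in-memory database for the demo
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

first = languages_for(session, ['English', 'Spanish'])
session.commit()
second = languages_for(session, ['English', 'German', 'English'])
session.commit()
```

The result of languages_for could then be assigned to user.languages. This still issues one SELECT per call, so it reduces round trips rather than replacing a true database-level upsert, and it does not protect against two concurrent transactions creating the same language.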