Python 3.x: Updating column contents during an Alembic migration

Suppose my database model contains an object User:

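from sqlalchemy import Column, String, Unicode
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'

    id = Column(String(32), primary_key=True, default=...)
    name = Column(Unicode(100))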

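My database contains a users table with n rows. At some point, I decide to split name into firstname and lastname, and I would like my existing data to be migrated as part of that change.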

The auto-generated migration looks like this:

def upgrade():
    op.add_column('users', sa.Column('lastname', sa.Unicode(length=50), nullable=True))
    op.add_column('users', sa.Column('firstname', sa.Unicode(length=50), nullable=True))

    # Assuming that the two new columns have been committed and exist at
    # this point, I would like to iterate over all rows of the name column,
    # split the string, write it into the new firstname and lastname rows,
    # and once that has completed, continue to delete the name column.

    op.drop_column('users', 'name')                                             

def downgrade():
    op.add_column('users', sa.Column('name', sa.Unicode(length=100), nullable=True))

    # Do the reverse of the above.

    op.drop_column('users', 'firstname')                                        
    op.drop_column('users', 'lastname')

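There seem to be several more or less hacky solutions to this problem. Two of them propose executing raw SQL statements during the migration. Another suggests importing the current db model, but that approach is fragile when the model changes.

How do I migrate and modify the existing contents of a column during an Alembic migration? What is the recommended way, and where is it documented?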

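Alembic is a schema migration tool, not a data migration tool, although it can be used that way too. That is why you won't find much documentation on the subject. That said, I would create three separate revisions:

1. Add firstname and lastname without removing name.

2. Read all your users as you would in your application, split their names, and update firstname and lastname, e.g.

for user in session.query(User).all():
    # Assumes every name contains exactly one space; otherwise the
    # tuple unpacking raises ValueError.
    user.firstname, user.lastname = user.name.split(' ')
session.commit()

3. Remove name.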

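The solution proposed in the previous answer sounds good at first, but I think it has a fundamental flaw: it introduces multiple transactions, and between the steps the database would be in an awkward, inconsistent state. It also seems odd to me that a tool would migrate a database's schema without migrating its data; the two are too closely tied to be separated.

After some poking around and several conversations, I settled on the following solution: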
def upgrade():

    # Schema migration: add all the new columns.
    op.add_column('users', sa.Column('lastname', sa.Unicode(length=50), nullable=True))
    op.add_column('users', sa.Column('firstname', sa.Unicode(length=50), nullable=True))

    # Data migration: takes a few steps...
    # Declare ORM table views. Note that the view contains old and new columns!        
    t_users = sa.Table(
        'users',
        sa.MetaData(),
        sa.Column('id', sa.String(32)),
        sa.Column('name', sa.Unicode(length=100)), # Old column.
        sa.Column('lastname', sa.Unicode(length=50)), # Two new columns.
        sa.Column('firstname', sa.Unicode(length=50)),
        )
    # Use Alchemy's connection and transaction to noodle over the data.
    connection = op.get_bind()
    # Select all existing names that need migrating.
    results = connection.execute(sa.select([
        t_users.c.id,
        t_users.c.name,
        ])).fetchall()
    # Iterate over all selected data tuples.
    for id_, name in results:
        # Split the existing name into first and last.
        firstname, lastname = name.rsplit(' ', 1)
        # Update the new columns.
        connection.execute(t_users.update().where(t_users.c.id == id_).values(
            lastname=lastname,
            firstname=firstname,
            ))

    # Schema migration: drop the old column.
    op.drop_column('users', 'name')                                             
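
One thing to watch in the data step above: `name.rsplit(' ', 1)` raises ValueError when a name contains no space, and fails on NULL values, which the nullable `name` column allows. A minimal sketch of a more forgiving split follows. The helper `split_name` and its fallback rules (a single-word name becomes the first name, empty names become NULL) are assumptions for illustration, not part of the original answer.

```python
def split_name(name):
    """Split a full name into (firstname, lastname) at the last space.

    Returns (None, None) for missing or blank names and (name, None)
    for single-word names, instead of raising like a bare two-element
    unpacking of rsplit() would.
    """
    if name is None:
        return None, None
    parts = name.strip().rsplit(" ", 1)
    if len(parts) == 2:
        # rstrip() tidies up names with repeated spaces, e.g. "Ada  Lovelace".
        return parts[0].rstrip(), parts[1]
    # No space at all: treat the whole string as the first name.
    return (parts[0] or None), None
```

Inside the loop, `firstname, lastname = split_name(name)` could then replace the direct `rsplit` call.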

Two comments on the upgrade() migration above:

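1. As mentioned in the referenced gist, newer versions of Alembic use a slightly different notation.
2. Depending on the DB driver, the code may behave differently. Apparently, MySQL does not handle the above code as a single transaction (its DDL statements commit implicitly), so you have to check how your DB implementation behaves.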
The downgrade() function can be implemented similarly.
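
As a rough sketch of the reverse data step: once downgrade() has re-added `name`, the column can be rebuilt from `firstname` and `lastname` with a single UPDATE before the two newer columns are dropped. The example below is an illustration under stated assumptions, not the answer's code. It uses the standard-library `sqlite3` module in place of Alembic's connection so that it runs standalone, `merge_names` is a hypothetical helper, and the `||` concatenation operator is SQLite/PostgreSQL syntax (MySQL would typically need `CONCAT()`).

```python
import sqlite3


def merge_names(conn):
    """Reverse data step: rebuild `name` from `firstname` and `lastname`."""
    # COALESCE turns NULL parts into '', TRIM removes the stray space
    # left behind when one of the two parts is missing.
    conn.execute(
        "UPDATE users "
        "SET name = TRIM(COALESCE(firstname, '') || ' ' || COALESCE(lastname, ''))"
    )


# Demo on an in-memory table that holds both the old and the new columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT, "
    "firstname TEXT, lastname TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, firstname, lastname) VALUES (?, ?, ?)",
    [("1", "Ada", "Lovelace"), ("2", "Mary Ann", "Evans"), ("3", "Plato", None)],
)
merge_names(conn)
rows = conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()
```

In an actual migration, the same UPDATE statement could be passed to `op.execute()` between the `op.add_column` call and the two `op.drop_column` calls.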
Addendum. For an example of pairing a schema migration with a data migration, see the relevant section of the Alembic Cookbook.
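I don't think schema migration and data migration can be separated. The two go hand in hand: you cannot migrate a database's schema without migrating its data along with it. So if Alembic is designed to do only one of them and hardly the other, what is it good for?

This is a great answer, but I had to make a small change to your code. I had to specify the columns via t_users.c.id or t_users.c.name in the two connection.execute calls. Can you confirm this, and perhaps edit your answer (or explain what is going on, if you happen to know)?

@daveruinseverything, confirmed, and I fixed the example code; thanks very much.

Unfortunately, this answer sounds good at first but has a fundamental flaw. At least in the context of an always-running web application, it is not a great approach. The reason is that a data migration on a large table can be very expensive, so you generally want to keep it out of the "hot path" of deploying your application. If your DB is already under some load, the migration adds to it and can really slow your application down, so it is good to be able to run it in the background, possibly with a throttled write rate.

^ In addition (I ran over the character limit), a last reason why @norbertby's answer is the better approach is that you can stop before doing anything irreversible. This is a very simple example, but it is common for a data migration to work fine on test data and then break something in production because of the wide variety of real data. If you keep the data in both the name and firstname columns, you can roll back the application if anything goes wrong. If you have already migrated it irreversibly, you are in a much worse state.

@danny, that is why I back up the db before migrating, and why I test the migration on the backup before trying it on the live db.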
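
The concern raised in the comments, that a data migration on a large table is expensive and is better run in the background with throttled writes, can be sketched roughly as follows. This is an illustration under several assumptions, not a recommended implementation: it uses the standard-library `sqlite3` module as a stand-in for the real database so that it runs standalone, the names `backfill_names`, `batch_size` and `pause_seconds` are hypothetical, and it presumes the three-revision approach, where `firstname` and `lastname` already exist and `name` has not yet been dropped.

```python
import sqlite3
import time


def backfill_names(conn, batch_size=1000, pause_seconds=0.0):
    """Copy `name` into `firstname`/`lastname` in small batches.

    Returns the number of rows updated. Each batch is committed on its
    own, so the job can be interrupted and resumed later.
    """
    total = 0
    while True:
        # Rows still waiting for migration: old column set, new one empty.
        rows = conn.execute(
            "SELECT id, name FROM users "
            "WHERE name IS NOT NULL AND firstname IS NULL "
            "LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            return total
        for id_, name in rows:
            first, sep, last = name.rpartition(" ")
            if not sep:
                # Single-word name: keep it as the first name.
                first, last = name, None
            conn.execute(
                "UPDATE users SET firstname = ?, lastname = ? WHERE id = ?",
                (first, last, id_),
            )
        conn.commit()
        total += len(rows)
        time.sleep(pause_seconds)  # crude throttle between batches


# Demo on an in-memory database with the columns from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT, "
    "firstname TEXT, lastname TEXT)"
)
conn.executemany(
    "INSERT INTO users (id, name) VALUES (?, ?)",
    [("1", "Ada Lovelace"), ("2", "Mary Ann Evans"), ("3", "Plato"), ("4", None)],
)
updated = backfill_names(conn, batch_size=2)
migrated = conn.execute(
    "SELECT id, firstname, lastname FROM users ORDER BY id"
).fetchall()
```

Because the old `name` column stays in place until a later revision drops it, a job like this can be stopped or rerun without losing data, which is the rollback advantage the comments describe.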