Python Redshift COPY command not working in SQLAlchemy

I am trying to do a Redshift COPY in SQLAlchemy.

The following SQL correctly copies objects from my S3 bucket into my Redshift table when I execute it in psql:

COPY posts FROM 's3://mybucket/the/key/prefix' 
WITH CREDENTIALS 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretaccesskey' 
JSON AS 'auto';
I have several files named

s3://mybucket/the/key/prefix.001.json
s3://mybucket/the/key/prefix.002.json   
etc.
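Redshift treats the FROM value as a key prefix, so every object whose key starts with it is loaded. A minimal illustration of that prefix matching (plain Python; the third key is a made-up non-matching example):

```python
# COPY loads every object whose key begins with the given prefix.
keys = [
    's3://mybucket/the/key/prefix.001.json',
    's3://mybucket/the/key/prefix.002.json',
    's3://mybucket/other/object.json',   # hypothetical key that would NOT match
]
prefix = 's3://mybucket/the/key/prefix'
loaded = [k for k in keys if k.startswith(prefix)]
print(loaded)  # only the two prefix.00x.json files
```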
I can verify that the new rows were added to the table with select count(*) from posts.

However, when I execute the exact same SQL expression in SQLAlchemy, execute completes without error, but no rows get added to the table.

session = get_redshift_session()
session.bind.execute("COPY posts FROM 's3://mybucket/the/key/prefix' WITH CREDENTIALS 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretaccesskey' JSON AS 'auto';")
session.commit()
It doesn't matter whether I do the above or:

from sqlalchemy.sql import text 
session = get_redshift_session()
session.execute(text("COPY posts FROM 's3://mybucket/the/key/prefix' WITH CREDENTIALS 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretaccesskey' JSON AS 'auto';"))
session.commit()

I got this to work using the core expression language and Connection.execute (as opposed to the ORM and sessions) to copy delimited files to Redshift with the code below. Perhaps you could adapt it for JSON.

def copy_s3_to_redshift(conn, s3path, table, aws_access_key, aws_secret_key, delim='\t', uncompress='auto', ignoreheader=None):
    """Copy a TSV file from S3 into redshift.

    Note the CSV option is not used, so quotes and escapes are ignored.  Empty fields are loaded as null.
    Does not commit a transaction.
    :param Connection conn: SQLAlchemy Connection
    :param str uncompress: None, 'gzip', 'lzop', or 'auto' to autodetect from `s3path` extension.
    :param int ignoreheader: Ignore this many initial rows.
    :return: Whatever a copy command returns.
    """
    if uncompress == 'auto':
        uncompress = 'gzip' if s3path.endswith('.gz') else 'lzop' if s3path.endswith('.lzo') else None

    copy = text("""
        copy "{table}"
        from :s3path
        credentials 'aws_access_key_id={aws_access_key};aws_secret_access_key={aws_secret_key}'
        delimiter :delim
        emptyasnull
        ignoreheader :ignoreheader
        compupdate on
        comprows 1000000
        {uncompress};
        """.format(uncompress=uncompress or '', table=text(table), aws_access_key=aws_access_key, aws_secret_key=aws_secret_key))    # copy command doesn't like table name or keys single-quoted
    return conn.execute(copy, s3path=s3path, delim=delim, ignoreheader=ignoreheader or 0)
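The answer above suggests adapting it for JSON. A minimal sketch of what the statement itself could look like, rendered as a plain string so it can be inspected before execution (the function name and parameters are mine, not from the answer; in production, consider IAM roles instead of interpolating keys):

```python
def build_json_copy_sql(table, s3prefix, aws_access_key, aws_secret_key):
    """Hypothetical helper: render a COPY ... JSON 'auto' statement.

    As in the answer above, the table name is double-quoted and only the
    S3 path and credentials string are single-quoted, since COPY rejects
    a single-quoted table name.
    """
    return (
        'copy "{table}" '
        "from '{s3prefix}' "
        "credentials 'aws_access_key_id={ak};aws_secret_access_key={sk}' "
        "json 'auto';"
    ).format(table=table, s3prefix=s3prefix,
             ak=aws_access_key, sk=aws_secret_key)
```

The resulting string can then be passed to Connection.execute just like the delimited-file version.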

I had basically the same problem, though in my case it was more:

engine = create_engine('...')
engine.execute(text("COPY posts FROM 's3://mybucket/the/key/prefix' WITH CREDENTIALS 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretaccesskey' JSON AS 'auto';"))
By stepping through with pdb, the problem was obviously that no .commit() was being invoked. I don't know why session.commit() is not working in your case (maybe the session "lost track" of the sent commands?), so it may not actually fix your problem.

Anyhow, as the SQLAlchemy docs explain:

Given this requirement, SQLAlchemy implements its own "autocommit" feature which works completely consistently across all backends. This is achieved by detecting statements which represent data-changing operations, i.e. INSERT, UPDATE, DELETE [...] If the statement is a text-only statement and the flag is not set, a regular expression is used to detect INSERT, UPDATE, DELETE, as well as a variety of other commands for a particular backend.
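The detection described in that quote can be sketched with a simplified regex (my own illustration, not SQLAlchemy's actual pattern): COPY is not in the list, which is why the statement never triggered an implicit commit.

```python
import re

# Simplified sketch of the autocommit detection described above; the real
# SQLAlchemy regex differs, but the point is the same: COPY is not matched.
DML_RE = re.compile(r'\s*(INSERT|UPDATE|DELETE)\b', re.IGNORECASE)

def should_autocommit(statement):
    return DML_RE.match(statement) is not None

should_autocommit("INSERT INTO posts VALUES (1)")                    # True
should_autocommit("COPY posts FROM 's3://mybucket/the/key/prefix'")  # False
```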

So there are two solutions, either:

text("COPY posts FROM 's3://mybucket/the/key/prefix' WITH CREDENTIALS 'aws_access_key_id=myaccesskey;aws_secret_access_key=mysecretaccesskey' JSON AS 'auto';").execution_options(autocommit=True)
Or, get a fixed version of the Redshift dialect... I'm just saying ;)
Adding a commit to the end of the copy statement worked for me:

<your copy sql>;commit;
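A small helper along those lines (the function is my own sketch, not from the answer; it just appends the explicit commit the answer describes):

```python
def with_commit(copy_sql):
    # Append an explicit commit so the COPY persists even when
    # SQLAlchemy's autocommit detection misses the statement.
    return copy_sql.rstrip().rstrip(';') + '; commit;'
```

The result can be passed to execute in place of the bare COPY statement, provided the driver allows multiple statements per call (psycopg2 does).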