
Python: Athena query from a Lambda function returns a QUEUED state?

Tags: python, amazon-s3, aws-lambda, amazon-athena

I have been successfully querying S3 via Athena from inside a Lambda function for a long time, but it suddenly stopped working. Further investigation shows that the response from get_query_execution() returns a "QUEUED" state (which I was led to believe was not used?!)

My code is as follows:

def run_query(query, database, s3_output, max_execution=5):
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': database
        },
        ResultConfiguration={
            'OutputLocation': s3_output
        })

    execution_id = response['QueryExecutionId']
    print("QueryExecutionId = " + str(execution_id))
    state = 'RUNNING'

    while max_execution > 0 and state in ['RUNNING']:
        max_execution = max_execution - 1
        print("maxexecution=" + str(max_execution))
        response = client.get_query_execution(QueryExecutionId=execution_id)

        if 'QueryExecution' in response and \
                'Status' in response['QueryExecution'] and \
                'State' in response['QueryExecution']['Status']:

            state = response['QueryExecution']['Status']['State']
            print(state)
            if state == 'SUCCEEDED':
                print("Query SUCCEEDED: {}".format(execution_id))

                s3_key = 'athena_output/' + execution_id + '.csv'
                print(s3_key)
                local_filename = '/tmp/' + execution_id + '.csv'
                print(local_filename)

                rows = []
                try:
                    print("s3key =" + s3_key)
                    print("localfilename = " + local_filename)
                    s3.Bucket(BUCKET).download_file(s3_key, local_filename)
                    with open(local_filename) as csvfile:
                        reader = csv.DictReader(csvfile)
                        for row in reader:
                            rows.append(row)
                except botocore.exceptions.ClientError as e:
                    if e.response['Error']['Code'] == "404":
                        print("The object does not exist.")
                        print(e)
                    else:
                        raise
                return json.dumps(rows)
            elif state == 'FAILED':
                return False
        time.sleep(10)
    return False
So the code is clearly working the way it was written to work; it is just that the "QUEUED" state was completely unexpected, and I don't know what to do about it. What can cause a query execution to become "QUEUED", and what changes do I need to make in my code to accommodate it?

Take a look at the Athena documentation. Athena has final states (SUCCEEDED, FAILED, and CANCELLED) and intermediate states (RUNNING and QUEUED). QUEUED is the normal state of a query before it starts running. So you could use code like this:

import csv
import json
import time

import boto3
import botocore

# Placeholder setup -- substitute your own bucket name and region as needed.
BUCKET = 'my-athena-results-bucket'
client = boto3.client('athena')
s3 = boto3.resource('s3')

def run_query(query, database, s3_output, max_execution=5):
    response = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={
            'Database': database
        },
        ResultConfiguration={
            'OutputLocation': s3_output
        })

    execution_id = response['QueryExecutionId']
    print("QueryExecutionId = " + str(execution_id))
    state = 'QUEUED'

    # QUEUED is now treated as an intermediate state, alongside RUNNING.
    while max_execution > 0 and state in ['RUNNING', 'QUEUED']:
        max_execution = max_execution - 1
        print("maxexecution=" + str(max_execution))
        response = client.get_query_execution(QueryExecutionId=execution_id)

        if 'QueryExecution' in response and \
                'Status' in response['QueryExecution'] and \
                'State' in response['QueryExecution']['Status']:

            state = response['QueryExecution']['Status']['State']
            print(state)
            if state == 'SUCCEEDED':
                print("Query SUCCEEDED: {}".format(execution_id))

                s3_key = 'athena_output/' + execution_id + '.csv'
                print(s3_key)
                local_filename = '/tmp/' + execution_id + '.csv'
                print(local_filename)

                rows = []
                try:
                    print("s3key =" + s3_key)
                    print("localfilename = " + local_filename)
                    s3.Bucket(BUCKET).download_file(s3_key, local_filename)
                    with open(local_filename) as csvfile:
                        reader = csv.DictReader(csvfile)
                        for row in reader:
                            rows.append(row)
                except botocore.exceptions.ClientError as e:
                    if e.response['Error']['Code'] == "404":
                        print("The object does not exist.")
                        print(e)
                    else:
                        raise
                return json.dumps(rows)
            elif state == 'FAILED' or state == 'CANCELLED':
                return False
        time.sleep(10)
    return False
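The polling decision in the code above can also be reduced to a small, testable helper. The sketch below is not part of the Athena API; it just encodes the state names discussed in this thread (SUCCEEDED, FAILED, and CANCELLED as final; RUNNING and QUEUED as intermediate), with `fetch_state` standing in for a call to get_query_execution():

```python
# Final vs. intermediate Athena query states, as described in this thread.
TERMINAL_STATES = {'SUCCEEDED', 'FAILED', 'CANCELLED'}
INTERMEDIATE_STATES = {'RUNNING', 'QUEUED'}

def poll_until_terminal(fetch_state, max_execution=5):
    """Call fetch_state() up to max_execution times until a terminal state
    is seen. Returns the last state observed, which may still be an
    intermediate state if the attempts ran out first."""
    state = 'QUEUED'  # a freshly submitted query starts out queued
    while max_execution > 0 and state in INTERMEDIATE_STATES:
        max_execution -= 1
        state = fetch_state()
    return state
```

Separating "when to stop polling" from "what to do with the result" makes the QUEUED fix a one-place change instead of something scattered through the download logic.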

Got this response from AWS. Something changed in Athena that caused this (QUEUED had been in the state enum for a while, but was not used until now):

The Athena team recently deployed a host of new functionality for Athena, including more fine-grained CloudWatch metrics for Athena queries.

For more information:

  • AWS

  • Athena docs

As part of the deployment of the more fine-grained metrics, Athena now includes a QUEUED state for queries. This state indicates that an Athena query is waiting for resources to be allocated for processing. The query flow is roughly:

SUBMITTED -> QUEUED -> RUNNING -> COMPLETED/FAILED
Note that queries which fail due to a system error can be put back into the queue and retried.

I apologize for the frustration that this change has caused.

It looks like the forum formatting has stripped some elements from your code snippet. However, I believe your WHILE loop is working through an array of possible query states, which did not previously account for QUEUED. If that is the case, then yes, adding QUEUED to that array will allow your application to handle the new state.
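Concretely, the whole fix comes down to widening that loop condition. A minimal sketch (the function name is illustrative, not from the original code):

```python
def keep_polling(state):
    # QUEUED added alongside RUNNING, so newly queued queries are
    # waited on instead of falling through to the final failure return.
    return state in ['RUNNING', 'QUEUED']
```

With the old condition, `state in ['RUNNING']`, a query reported as QUEUED exited the loop immediately and the function returned False even though the query had not failed.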

Can you confirm whether the number of queries submitted to Athena from your account has increased? If you are not submitting more queries, are other users submitting queries at the same time as you?

I am the one being queued, and nobody else is using Athena. Athena must just have a low priority for AWS compute. Not surprising given how cheap it is: the compute is free and you only pay for the S3 retrieval.