
Airflow & boto: S3ResponseError: 403 Forbidden


I'm using boto 2 to access my S3 bucket, but I'm having trouble getting consistent results when the script runs under the Apache Airflow scheduler. I have a script that gets my bucket from S3 and uploads a zip file to it. The script runs on the Airflow scheduler via a BashOperator. The schedule seems to work fine while I'm at the computer, or when I run the script manually. I've noticed that the script throws a 403 Forbidden error when the machine has been idle. For example, the script runs once a day and executes successfully under Airflow all week, but once the weekend starts it fails with the 403 error. Any idea why this pattern of behavior happens? My AWS account should have full access to S3. I'm running Airflow from a Docker container.

Here is the function in my script that accesses the bucket:

import datetime
import boto
from boto.s3.key import Key

def upload_s3(string_passed):
    s3 = boto.connect_s3()  # credentials resolved from the environment / boto config at call time
    # note: validate='False' is a truthy string, so boto still issues a HEAD request against the bucket
    aabucket = s3.get_bucket('bucketname', validate='False')
    k = Key(aabucket)
    k.key = datetime.datetime.now().strftime("%Y%m%d-%H%M%S") + '.gz'  # timestamped filename
    k.set_contents_from_string(string_passed)  # send data to S3
    print("S3 UPLOAD SUCCESSFUL!")
Here is the DAG configuration script I'm running:

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from datetime import datetime, timedelta

schedule_interval="00 18 * * *"

args = {

    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(year=2017, month=3, day=21, hour=18, minute=0, second=0),
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 4,
    'retry_delay': timedelta(minutes=2)
}

dag = DAG(
    dag_id = 'appannie_bash_operator_v10',
    default_args = args,
    schedule_interval = schedule_interval
)

t1 = BashOperator(
    task_id = 'get_json',
    bash_command = 'python ~/appannie/appannie_scrape_update.py',
    dag=dag,
)

t2 = BashOperator(
    task_id = 'export_s3',
    bash_command = 'python --version',
    dag = dag,
)
t2.set_upstream(t1)
Here is the error I'm getting in the log:

[2017-04-04 21:50:28,996] {models.py:1219} INFO - Executing <Task(BashOperator): get_json> on 2017-04-03 18:00:00
[2017-04-04 21:50:29,156] {bash_operator.py:55} INFO - tmp dir root location: 
/tmp
[2017-04-04 21:50:29,157] {bash_operator.py:64} INFO - Temporary script location :/tmp/airflowtmp5_nqrG//tmp/airflowtmp5_nqrG/get_json0aZng2
[2017-04-04 21:50:29,158] {bash_operator.py:65} INFO - Running command: python ~/appannie/appannie_scrape_update.py
[2017-04-04 21:50:29,162] {bash_operator.py:73} INFO - Output:
[2017-04-04 21:51:06,057] {bash_operator.py:77} INFO - JSON DOWNLOAD SUCCESSFUL!
[2017-04-04 21:51:06,368] {bash_operator.py:77} INFO - Traceback (most recent call last):
[2017-04-04 21:51:06,678] {bash_operator.py:77} INFO - File "/usr/local/airflow/appannie/appannie_scrape_update.py", line 244, in <module>
[2017-04-04 21:51:06,679] {bash_operator.py:77} INFO - upload_s3(x2)
[2017-04-04 21:51:06,680] {bash_operator.py:77} INFO - File "/usr/local/airflow/appannie/appannie_scrape_update.py", line 41, in upload_s3
[2017-04-04 21:51:06,680] {bash_operator.py:77} INFO - aabucket = s3.get_bucket('nixhydra-appannie',validate='False')
[2017-04-04 21:51:06,681] {bash_operator.py:77} INFO - File "/usr/local/airflow/.local/lib/python2.7/site-packages/boto/s3/connection.py", line 506, in get_bucket
[2017-04-04 21:51:06,681] {bash_operator.py:77} INFO - return self.head_bucket(bucket_name, headers=headers)
[2017-04-04 21:51:06,682] {bash_operator.py:77} INFO - File "/usr/local/airflow/.local/lib/python2.7/site-packages/boto/s3/connection.py", line 539, in head_bucket
[2017-04-04 21:51:06,682] {bash_operator.py:77} INFO - raise err
[2017-04-04 21:51:06,683] {bash_operator.py:77} INFO - boto.exception.S3ResponseError: S3ResponseError: 403 Forbidden
[2017-04-04 21:51:06,683] {bash_operator.py:77} INFO - 
[2017-04-04 21:51:06,684] {bash_operator.py:80} INFO - Command exited with return code 1
[2017-04-04 21:51:06,685] {models.py:1286} ERROR - Bash command failed
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/airflow/models.py", line 1245, in run
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python2.7/dist-packages/airflow/operators/bash_operator.py", line 83, in execute
    raise AirflowException("Bash command failed")
AirflowException: Bash command failed
[2017-04-04 21:51:06,686] {models.py:1298} INFO - Marking task as UP_FOR_RETRY
[2017-04-04 21:51:07,699] {models.py:1327} ERROR - Bash command failed
[2017-04-04 21:53:18,231] {models.py:154} INFO - Filling up the DagBag from /usr/local/airflow/dags/appannie_bash_operator_v10.py
[2017-04-04 21:53:21,470] {models.py:154} INFO - Filling up the DagBag from /usr/local/airflow/dags/appannie_bash_operator_v10.py
[2017-04-04 21:53:21,515] {models.py:1196} INFO - 

Did you pass the proper user environment that the Python script needs?

@mootmoot What do you mean by user environment? Is that the DAG config?

All the task schedulers load another shell to run those programs, so if your program needs explicit library paths and the like, you have to specify them. That said, since you are using boto and boto3, Airflow appears to have its own machinery for reading credentials: check that and follow those links, or post your question there.

I have already given my script the credentials it needs to access S3, and the script is able to run on its own.
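Following the thread in the comments, one way to check whether this is a credential-lookup problem rather than a bucket-permission problem is to pass the keys to boto explicitly instead of relying on whatever environment the scheduler's shell happens to have. A minimal sketch, assuming the keys are exported as the usual AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables (the helper name and the use of those variables are illustrative, not from the original post):

import os
import boto

def connect_s3_explicit():
    # Read the keys from the environment the Airflow worker actually sees,
    # rather than letting boto fall back to its implicit credential lookup.
    return boto.connect_s3(
        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
    )

If the explicit version keeps working over a weekend while the implicit one starts returning 403 Forbidden, the problem sits in how credentials are resolved inside the scheduler's environment rather than in the bucket policy. The BashOperator also accepts an env= dictionary, which is one way to hand those variables to the shell it spawns for the script.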