Python: How to trigger a Glue job using the AWS GlueOperator


My script has a single task that triggers a Glue job. I was able to create the DAG. Here is my DAG code:

from airflow import DAG
from airflow.operators.email_operator import EmailOperator
from airflow.providers.amazon.aws.operators.glue import AwsGlueJobOperator
from datetime import datetime, timedelta


### glue job specific variables
glue_job_name = "my_glue_job"
glue_iam_role = "AWSGlueServiceRole"
region_name = "us-west-2"
email_recipient = "me@gmail.com"

default_args = {
    'owner': 'me',
    'start_date': datetime(2020, 1, 1),
    'retry_delay': timedelta(minutes=5),
    'email': email_recipient,
    'email_on_failure': True
}


with DAG(dag_id = 'glue_af_pipeline', default_args = default_args, schedule_interval = None) as dag:
    
    glue_job_step = AwsGlueJobOperator(
        job_name =glue_job_name,
        script_location = 's3://my-s3-location',
        region_name = region_name,
        iam_role_name = glue_iam_role,
        script_args=None,
        num_of_dpus=10,
        task_id = 'glue_job_step',
        dag = dag
        )
   
    glue_job_step

When I run the DAG, it fails with the following error:

[2020-10-13 08:27:14,315] {logging_mixin.py:112} INFO - [2020-10-13 08:27:14,315] {glue.py:114} ERROR - Failed to run aws glue job, error: Parameter validation failed:
Invalid type for parameter Arguments, value: [], type: <class 'list'>, valid types: <class 'dict'>
[2020-10-13 08:27:14,315] {taskinstance.py:1058} ERROR - Parameter validation failed:
Invalid type for parameter Arguments, value: [], type: <class 'list'>, valid types: <class 'dict'>
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 930, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/amazon/aws/operators/glue.py", line 115, in execute
    glue_job_run = glue_job.initialize_job(self.script_args)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/amazon/aws/hooks/glue.py", line 111, in initialize_job
    job_run = glue_client.start_job_run(JobName=job_name, Arguments=script_arguments)
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 337, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 628, in _make_api_call
    request_dict = self._convert_to_request_dict(
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 676, in _convert_to_request_dict
    request_dict = self._serializer.serialize_to_request(
  File "/usr/local/lib/python3.8/site-packages/botocore/validate.py", line 297, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid type for parameter Arguments, value: [], type: <class 'list'>, valid types: <class 'dict'>
[2020-10-13 08:27:14,316] {taskinstance.py:1089} INFO - Marking task as FAILED.


Any suggestions would be much appreciated.

If you are running an existing Glue job, try the following:

glue_job_step = AwsGlueJobOperator(
        task_id = "glue_job_step",
        job_name = glue_job_name,
        job_desc = f"triggering glue job {glue_job_name}",
        region_name = region_name,
        iam_role_name = glue_iam_role,
        num_of_dpus = 1,
        dag = dag
        )

If the job takes no input arguments, remove script_args.
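
If the job does take input arguments, the traceback suggests what to watch for: the Glue client rejected Arguments because it received a list where it expects a dict. Below is a minimal sketch of a guard you could apply before handing arguments to the operator. The helper name build_glue_arguments and the example keys and values are hypothetical and not part of Airflow or boto3; the "--" key prefix follows the usual Glue job parameter convention.

```python
from collections.abc import Mapping


def build_glue_arguments(script_args=None):
    """Hypothetical helper: return a dict suitable for Glue's Arguments parameter."""
    if script_args is None:
        # No arguments: fall back to an empty dict, never an empty list.
        return {}
    if not isinstance(script_args, Mapping):
        # Fail early with a clear message instead of a botocore validation error.
        raise TypeError(
            f"script_args must be a dict, got {type(script_args).__name__}"
        )
    # Glue job arguments are string-to-string pairs.
    return {str(key): str(value) for key, value in script_args.items()}


# Example values are assumptions for illustration only.
glue_arguments = build_glue_arguments({"--run_date": "2020-10-13", "--retries": 3})
print(glue_arguments)
```

The resulting dict could then be passed as script_args = glue_arguments in the operator call above. This only matters when you actually have arguments to pass; with none, leaving script_args out, as suggested, is the simpler route.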
