Airflow: how to fix the error: print_context() missing 1 required positional argument: 'ds'


I have a DAG as follows, in ingest_excel.py:

from __future__ import print_function

import time
from builtins import range
from datetime import timedelta
from pprint import pprint

import airflow
from airflow.models import DAG
#from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator

args = {
    'owner': 'rxie',
    'start_date': airflow.utils.dates.days_ago(2),
}

dag = DAG(
    dag_id='ingest_excel',
    default_args=args,
    schedule_interval='0 0 * * *',
    dagrun_timeout=timedelta(minutes=60),
)

def print_context(**kwargs):
    pprint("DAG info below:")
    pprint(kwargs)
    return 'Whatever you return gets printed in the logs'


t11_extract_excel_to_csv = PythonOperator(
    task_id='t1_extract_excel_to_csv',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)


t12_upload_csv_to_hdfs_parquet = PythonOperator(
    task_id='t12_upload_csv_to_hdfs_parquet',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)


t13_register_parquet_to_impala = PythonOperator(
    task_id='t13_register_parquet_to_impala',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)

t21_text_to_parquet = PythonOperator(
    task_id='t21_text_to_parquet',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)

t22_register_parquet_to_impala = PythonOperator(
    task_id='t22_register_parquet_to_impala',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)

t31_verify_completion = PythonOperator(
    task_id='t31_verify_completion',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)

t32_send_notification = PythonOperator(
    task_id='t32_send_notification',
    provide_context=True,
    python_callable=print_context(),
    op_kwargs=None,
    dag=dag,
)

t11_extract_excel_to_csv >> t12_upload_csv_to_hdfs_parquet
t12_upload_csv_to_hdfs_parquet >> t13_register_parquet_to_impala

t21_text_to_parquet >> t22_register_parquet_to_impala


t13_register_parquet_to_impala >> t31_verify_completion
t22_register_parquet_to_impala >> t31_verify_completion

t31_verify_completion >> t32_send_notification


#if __name__ == "__main__":
#    dag.cli()
In the DAG GUI, it reports:

Broken DAG: [/root/aiffort/dags/ingest_excel.py] python_callable param must be callable

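This is my first DAG in Airflow and I am fairly new to it. I would really appreciate it if someone could shed some light on this and help me sort it out.

Thanks in advance.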

I'm not entirely sure why your code isn't working. It should work, but here is a workaround:

def print_context(**kwargs):
    ds = kwargs['ds']

Also, python_callable should be passed like this:

python_callable=print_context,

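To elaborate on your problem: your DAG is broken because you are not passing the function print_context to the PythonOperator; you are passing the result of calling print_context:

python_callable=print_context(),

…and you get the error you see. Another contributor correctly pointed out that you should change the PythonOperator.python_callable keyword argument to just print_context.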
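
The difference between passing a function and passing the result of calling it can be sketched without Airflow installed. In the sketch below, check_python_callable and print_context_with_ds are hypothetical names introduced for illustration; the check only imitates the kind of validation behind the "must be callable" message and is not Airflow's actual code. It also suggests why a signature with a required ds parameter produces the "missing 1 required positional argument" error from the title when the function is called with empty parentheses.

```python
from pprint import pprint


def print_context(**kwargs):
    pprint(kwargs)
    return 'Whatever you return gets printed in the logs'


def print_context_with_ds(ds, **kwargs):
    # Hypothetical variant with a required positional parameter.
    print(ds)
    return 'done'


def check_python_callable(python_callable):
    """Hypothetical stand-in for the validation an operator might perform."""
    if not callable(python_callable):
        raise TypeError('python_callable param must be callable')
    return python_callable


# Passing the function object itself is accepted.
accepted = check_python_callable(print_context)

# Calling the function first passes its return value (a str), which is rejected.
try:
    check_python_callable(print_context())
    rejected_message = None
except TypeError as exc:
    rejected_message = str(exc)

# With a required positional parameter, the premature call fails even earlier,
# while the DAG file is being parsed.
try:
    print_context_with_ds()
    missing_arg_message = None
except TypeError as exc:
    missing_arg_message = str(exc)

print(rejected_message)
print(missing_arg_message)
```

In both failing cases the mistake is the same pair of parentheses: the call happens when the DAG file is parsed, not when the task runs.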


In newer versions of Airflow, you need to pass the following option to the PythonOperator:

provide_context=True
Otherwise, the ds argument is not passed to the function. This is an Airflow change I ran into recently.

Full example:

def print_context(ds, **kwargs):
    pprint(kwargs)
    print(ds)
    return 'Whatever you return gets printed in the logs'


run_this = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=print_context,
    dag=dag,
)
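
How the context reaches the function can be illustrated with a minimal sketch that does not import Airflow. Here run_callable, print_context_kwargs_only and fake_context are hypothetical names, and the context values are made up; the sketch only assumes that the operator calls python_callable with the context unpacked as keyword arguments. Whether provide_context=True is required depends on the Airflow version (as far as I know, Airflow 2.x passes the context automatically and treats the flag as deprecated), so check the documentation for the version you run.

```python
def print_context(ds, **kwargs):
    # 'ds' is bound by name; every other context key lands in kwargs.
    return 'ds={}, other keys={}'.format(ds, sorted(kwargs))


def print_context_kwargs_only(**kwargs):
    # Equivalent workaround: read 'ds' out of kwargs explicitly.
    return 'ds={}'.format(kwargs['ds'])


def run_callable(python_callable, context, op_kwargs=None):
    """Hypothetical stand-in for an operator: merge arguments, then call."""
    call_kwargs = dict(context)
    call_kwargs.update(op_kwargs or {})
    return python_callable(**call_kwargs)


# Assumed context values, for illustration only.
fake_context = {'ds': '2019-01-01', 'task_id': 'print_the_context'}

explicit = run_callable(print_context, fake_context)
from_kwargs = run_callable(print_context_kwargs_only, fake_context)
print(explicit)
print(from_kwargs)
```

Both signatures see the same ds value; the explicit parameter simply makes the dependency visible in the function signature.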

Thank you very much. Removing the parentheses solved the problem.

Thanks for the detailed explanation, joeb.