Airflow: triggering a sub-DAG
Edited: I revised this question based on @tobi6's input.

I copied the SubDag operator from the Airflow source code and modified a few things in its execute method. The changes are meant to trigger the sub-DAG and then wait until the sub-DAG has finished executing. The trigger itself works, but the tasks are never executed: the DAG run is in the running (green) state, while its tasks stay in the null (white) state. My modified operator is shown below:
from airflow.exceptions import AirflowException
from airflow.models import BaseOperator, Pool
from airflow.utils.decorators import apply_defaults
from airflow.utils.db import provide_session
from airflow.utils.state import State
from airflow.executors import GetDefaultExecutor
from time import sleep
import logging
from datetime import datetime
class SubDagOperator(BaseOperator):

    template_fields = tuple()
    ui_color = '#555'
    ui_fgcolor = '#fff'

    @provide_session
    @apply_defaults
    def __init__(
            self,
            subdag,
            executor=GetDefaultExecutor(),
            *args, **kwargs):
        """
        Yo dawg. This runs a sub dag. By convention, a sub dag's dag_id
        should be prefixed by its parent and a dot. As in `parent.child`.
        :param subdag: the DAG object to run as a subdag of the current DAG.
        :type subdag: airflow.DAG
        :param dag: the parent DAG
        :type subdag: airflow.DAG
        """
        import airflow.models
        dag = kwargs.get('dag') or airflow.models._CONTEXT_MANAGER_DAG
        if not dag:
            raise AirflowException('Please pass in the `dag` param or call '
                                   'within a DAG context manager')
        session = kwargs.pop('session')
        super(SubDagOperator, self).__init__(*args, **kwargs)

        # validate subdag name
        if dag.dag_id + '.' + kwargs['task_id'] != subdag.dag_id:
            raise AirflowException(
                "The subdag's dag_id should have the form "
                "'{{parent_dag_id}}.{{this_task_id}}'. Expected "
                "'{d}.{t}'; received '{rcvd}'.".format(
                    d=dag.dag_id, t=kwargs['task_id'], rcvd=subdag.dag_id))

        # validate that subdag operator and subdag tasks don't have a
        # pool conflict
        if self.pool:
            conflicts = [t for t in subdag.tasks if t.pool == self.pool]
            if conflicts:
                # only query for pool conflicts if one may exist
                pool = (
                    session
                    .query(Pool)
                    .filter(Pool.slots == 1)
                    .filter(Pool.pool == self.pool)
                    .first()
                )
                if pool and any(t.pool == self.pool for t in subdag.tasks):
                    raise AirflowException(
                        'SubDagOperator {sd} and subdag task{plural} {t} both '
                        'use pool {p}, but the pool only has 1 slot. The '
                        'subdag tasks will never run.'.format(
                            sd=self.task_id,
                            plural=len(conflicts) > 1,
                            t=', '.join(t.task_id for t in conflicts),
                            p=self.pool
                        )
                    )

        self.subdag = subdag
        self.executor = executor

    def execute(self, context):
        dag_run = self.subdag.create_dagrun(
            conf=context['dag_run'].conf,
            state=State.RUNNING,
            execution_date=context['execution_date'],
            run_id='trig__' + str(datetime.utcnow()),
            external_trigger=True
        )
        while True:
            if dag_run.get_state() == State.FAILED or dag_run.get_state() == State.SUCCESS:
                break
            else:
                sleep(10)
                continue
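
For reference, the "trigger, then wait until finished" loop at the end of execute can be sketched on its own, without Airflow, as a polling helper with a timeout. This is only a minimal sketch under assumptions: `wait_for_terminal_state`, `TERMINAL_STATES`, and every parameter name are hypothetical, and the strings 'success' and 'failed' are assumed to match `State.SUCCESS` and `State.FAILED`. It does not by itself explain why the sub-DAG tasks stay unscheduled.

```python
import time

# Assumed to correspond to State.SUCCESS / State.FAILED in Airflow.
TERMINAL_STATES = ('success', 'failed')


def wait_for_terminal_state(get_state, poll_interval=10.0, timeout=3600.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Call get_state() repeatedly until it returns a terminal state.

    get_state is any zero-argument callable returning the current state.
    Raises TimeoutError if no terminal state is seen before the deadline,
    so the caller cannot block forever.
    """
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if clock() >= deadline:
            raise TimeoutError(
                'no terminal state after {} seconds (last state: {!r})'
                .format(timeout, state))
        sleep(poll_interval)


# Hypothetical use inside execute(); whether get_state() re-reads the
# database depends on the Airflow version, so this sketch refreshes first:
#
#     def current_state():
#         dag_run.refresh_from_db()
#         return dag_run.get_state()
#
#     final_state = wait_for_terminal_state(current_state)
```

Passing the state getter in as a callable keeps the loop testable and makes it explicit that each check should read fresh state rather than a value cached on the Python object.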
The code below shows how I am using this operator:
from airflow import DAG
from operators.sd_operator import SubDagOperator  # My SubDag Operator
from airflow.operators.python_operator import PythonOperator
import logging
from datetime import datetime

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2017, 7, 17),
    'email': ['airflow@example.com'],
    'email_on_failure': False,
    'email_on_retry': False,
}


def print_dag_details(**kwargs):
    logging.info(str(kwargs['dag_run'].conf))


with DAG('example_dag', schedule_interval=None, catchup=False, default_args=default_args) as dag:
    task_1 = SubDagOperator(
        subdag=sub_dag_func('example_dag', 'sub_dag_1'),
        task_id='sub_dag_1'
    )
    task_2 = SubDagOperator(
        subdag=sub_dag_func('example_dag', 'sub_dag_2'),
        task_id='sub_dag_2',
    )
    print_kwargs = PythonOperator(
        task_id='print_kwargs',
        python_callable=print_dag_details,
        provide_context=True
    )
    print_kwargs >> task_1 >> task_2
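Any information you can provide would be helpful. Thanks in advance.

It is a bit hard to understand your question without context. You wrote: "I copied the subdag operator and modified a few things in the execute method."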
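- Where was this copied from?
- What does it look like?
- It might help to pass named arguments in the call to sub_dag_func, e.g. sub_dag_func(subdag='parent_dag'…)
- The bit-shift expressions that set upstream/downstream reference tasks I cannot find in the DAG (df_job_1, df_job_2). They may be connected to the sub-DAGs (I have not looked at those yet).
- The sub-DAG names seem inconsistent with the comment in the code: by convention, a sub-DAG's dag_id should be prefixed by its parent and a dot, but they are sub_dag_1 and sub_dag_2.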
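
The naming convention in the last point can be illustrated with a small sketch that mirrors the check in the operator's `__init__`. The helper names `expected_subdag_id` and `validate_subdag_id` are hypothetical and not part of Airflow; only the `parent.child` form comes from the operator's docstring.

```python
def expected_subdag_id(parent_dag_id, task_id):
    """Build the dag_id a sub-DAG should have: '<parent_dag_id>.<task_id>'."""
    return '{}.{}'.format(parent_dag_id, task_id)


def validate_subdag_id(parent_dag_id, task_id, subdag_id):
    """Return subdag_id if it follows the convention, else raise ValueError."""
    expected = expected_subdag_id(parent_dag_id, task_id)
    if subdag_id != expected:
        raise ValueError(
            "expected sub-DAG id '{}', received '{}'".format(
                expected, subdag_id))
    return subdag_id


# With the names used in the question, the task 'sub_dag_1' in the DAG
# 'example_dag' needs a sub-DAG whose dag_id is 'example_dag.sub_dag_1'.
print(expected_subdag_id('example_dag', 'sub_dag_1'))
```

Whether sub_dag_func in the question already builds the id this way cannot be seen from the posted code, so this only shows what the operator's validation expects.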