Python Apache Airflow: XCom pull from a dynamic task name
I have successfully created dynamic tasks in a DAG (Bash and Docker operators), but I am having trouble passing the names of those dynamically created tasks to xcom_pull to fetch their data.
for i in range(0, max_tasks):
    task_scp_queue = BashOperator(task_id="scp_queue_task_{}".format(i), bash_command="""python foo""", retries=3, dag=dag, pool="scp_queue_pool", queue="foo", provide_context=True, xcom_push=True)
    # Pull the manifest ID from the previous task via xcom
    task_process_queue = DockerOperator(task_id="process_task_{}".format(i), command="""python foo --queue-name={{ task_instance.xcom_pull(task_ids=scp_queue_task_{}) }}""".format(i), retries=3, dag=dag, pool="process_pool", api_version="auto", image="foo", queue="foo", execution_timeout=timedelta(minutes=5))
    task_manifest = DockerOperator(api_version="auto", task_id="manifest_task_{}".format(i), image="foo", retries=3, dag=dag, command=""" python --manifestid={{ task_instance.xcom_pull(task_ids=scp_queue_task_{}) }}""".format(i), pool="manfiest_pool", queue="d_parser")
    task_psql_queue.set_downstream(task_scp_queue)
    task_process_queue.set_upstream(task_scp_queue)
    task_manifest.set_upstream(task_process_queue)
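As you can see, I tried using a Python format string inside the Jinja template to pass in the variable i, but that does not work.
I also tried using "task.task_id", and building a new string that contains only the task_id, but that did not work either.
Edit:
The command now looks like this: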
command="""python foo \
--queue-name="{{
task_instance.xcom_pull(task_ids='scp_queue_task_{}') }}"
""".format(i)
My Airflow debug log looks like this:
Using Master Queue: process_{
task_instance.xcom_pull(task_ids='scp_queue_task_31') }
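So the string value is being filled in, but the xcom_pull is not executed. I don't understand what is going on.

It would help to post the errors you are getting. In short, what you did looks fine: if
max_tasks=2
you would get: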
task_psql_queue.taskid --> scp_queue_task_0 >> process_task_0 >> manifest_task_0
\-> scp_queue_task_1 >> process_task_1 >> manifest_task_1
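To make the fan-out concrete, here is a minimal sketch in plain Python (no Airflow needed) that lists the task IDs each loop iteration creates and the chain they form. The helper name chain_for is hypothetical; only the ID patterns are taken from the question.

```python
def chain_for(i):
    """Return the task IDs created by loop iteration i, in dependency order."""
    return [
        "scp_queue_task_{}".format(i),
        "process_task_{}".format(i),
        "manifest_task_{}".format(i),
    ]


max_tasks = 2
chains = [chain_for(i) for i in range(max_tasks)]

for chain in chains:
    # Every chain hangs off the single upstream task_psql_queue task.
    print("task_psql_queue >> " + " >> ".join(chain))
```

Each iteration gets its own three-task chain, so the XCom pulled in iteration i has to name scp_queue_task_i exactly.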
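I don't think you need the timeout, since it is quite short. Because you had long lines and reordered the named parameters at random, I am going to reformat what you wrote: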
for i in range(0, max_tasks):
    task_scp_queue = BashOperator(
        task_id="scp_queue_task_{}".format(i),
        dag=dag,
        retries=3,  # you could make it a default arg on the dag
        pool="scp_queue_pool",
        queue="foo",  # you really want both queue and pool? When debugging remove them.
        bash_command="python foo",  # Maybe you snipped a multiline command
        provide_context=True,  # BashOp doesn't have this argument
        xcom_push=True,  # PUSH the manifest ID FOR the NEXT task via xcom
    )
    task_process_queue = DockerOperator(
        task_id="process_task_{}".format(i),
        dag=dag,
        retries=3,
        pool="process_pool",
        queue="foo",
        execution_timeout=timedelta(minutes=5),
        api_version="auto",
        image="foo",
        command="python foo --queue-name="
                "{{{{ task_instance.xcom_pull(task_ids=scp_queue_task_{}) }}}}".format(i),
    )
    task_manifest = DockerOperator(
        task_id="manifest_task_{}".format(i),
        retries=3,
        dag=dag,
        pool="manfiest_pool",
        queue="d_parser",
        api_version="auto",
        image="foo",
        command="python --manifestid="
                "{{{{ task_instance.xcom_pull(task_ids=scp_queue_task_{}) }}}}".format(i),
    )
    task_psql_queue >> task_scp_queue >> task_process_queue >> task_manifest
Oh, looking at it now, you did not pass the task ID as a string. Try:
    command="python foo --queue-name="
            "{{{{ task_instance.xcom_pull(task_ids='scp_queue_task_{}') }}}}".format(i),
… … …
    command="python --manifestid="
            "{{{{ task_instance.xcom_pull(task_ids='scp_queue_task_{}') }}}}".format(i),
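Thanks for the help! I have edited the question above with what I tried and the results.
@gleb1783 Sorry, my previous answer forgot that
"{{blah_{}}}".format(1)
produces "{blah_1}".
I needed to quadruple those double curly braces, e.g. "{{{{blah_{}}}}}".format(1)
becomes "{{blah_1}}".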
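The brace behaviour is easy to check in plain Python, without Airflow. This is a small sketch using the task ID pattern from the question; the value 31 is just the index that appears in the debug log above.

```python
i = 31

# Doubled braces are str.format escapes, so each pair collapses to ONE literal
# brace. Jinja never sees "{{ ... }}" and leaves the text unrendered.
broken = "{{ task_instance.xcom_pull(task_ids='scp_queue_task_{}') }}".format(i)

# Quadrupled braces collapse to the doubled braces that Jinja looks for.
fixed = "{{{{ task_instance.xcom_pull(task_ids='scp_queue_task_{}') }}}}".format(i)

print(broken)
print(fixed)
```

The first string has the same single-brace shape as the debug log in the question, which is why the task ID was filled in but xcom_pull never ran.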
This formatting also works for the BashOperator's bash_command. I just ran into this use case myself.
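If the quadrupled braces are hard to read, one possible alternative is to build the template by plain concatenation, so str.format never touches the Jinja braces. The helper name xcom_pull_template below is hypothetical, not part of Airflow. The sketch only builds the string; it assumes the result is then passed to a templated field such as bash_command or command, where Airflow renders it.

```python
def xcom_pull_template(task_id):
    """Build a Jinja expression that pulls the XCom value pushed by task_id."""
    # Plain concatenation: no str.format is applied to this string,
    # so the Jinja braces need no escaping.
    return "{{ task_instance.xcom_pull(task_ids='" + task_id + "') }}"


i = 0
upstream_id = "scp_queue_task_{}".format(i)  # format only the task ID itself

# Hypothetical command string, mirroring the one in the question.
bash_command = "python foo --queue-name=" + xcom_pull_template(upstream_id)
print(bash_command)
```

Keeping the task ID formatting separate from the template text also means the quoted task_ids value cannot be forgotten in one place and remembered in another.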