Airflow: how to render a value from XCom with MySqlToGoogleCloudStorageOperator
I have the following code:
import_orders_op = MySqlToGoogleCloudStorageOperator(
    task_id='import_orders',
    mysql_conn_id='mysql_con',
    google_cloud_storage_conn_id='gcp_con',
    sql='SELECT * FROM orders where orders_id>{0};'.format(LAST_IMPORTED_ORDER_ID),
    bucket=GCS_BUCKET_ID,
    filename=file_name,
    dag=dag)
I want to change the query to:
sql='SELECT * FROM orders where orders_id>{0} and orders_id<{1};'.format(LAST_IMPORTED_ORDER_ID, ...)
It gives:
Broken DAG: name 'task_instance' is not defined
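In the DAG file you are not inside a DAG run context with an existing task instance to use. You can only pull the value while the operator is running, not while you are setting it up. The setup code is executed by the scheduler in a loop and runs thousands of times a day, even if the DAG is scheduled weekly or is disabled. That said, what you wrote is very close to something that would work, so perhaps you had already considered this point about context.
Let's write it as a template: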
# YOUR EXAMPLE, FORMATTED A BIT MORE IN 80-COLUMN STYLE
…
    sql='SELECT * FROM orders where orders_id>{0} and orders_id<{1}'.format(
        LAST_IMPORTED_ORDER_ID,
        {{ task_instance.xcom_pull(
            task_ids=['get_max_order_id'], key='result_status') }}),
…
# SHOULD HAVE BEEN AT LEAST: I hope you can spot the difference.
…
    sql='SELECT * FROM orders where orders_id>{0} and orders_id<{1}'.format(
        LAST_IMPORTED_ORDER_ID,
        "{{ task_instance.xcom_pull("
        "task_ids=['get_max_order_id'], key='result_status') }}"),
…
# AND COULD HAVE BEEN MORE CLEARLY READABLE AS:
…
    sql='''
SELECT *
FROM orders
WHERE orders_id > {{ params.last_imported_id }}
  AND orders_id < {{ ti.xcom_pull('get_max_order_id') }}
''',
    params={'last_imported_id': LAST_IMPORTED_ORDER_ID},
…
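The difference between the first two variants can be reproduced without Airflow at all. The following is only a minimal sketch in plain Python: the value 100 for LAST_IMPORTED_ORDER_ID is a made-up placeholder, and eval is used purely to imitate what happens when the DAG file is parsed. It shows that the unquoted braces are evaluated as Python right away, while the quoted version survives .format() as literal text that a template engine could render later.

```python
# Hypothetical placeholder for the value the question keeps in this name.
LAST_IMPORTED_ORDER_ID = 100

# Unquoted: Python reads {{ ... }} as a set literal and evaluates it at once,
# so looking up task_instance fails while the file is still being parsed.
unquoted = (
    "{{ task_instance.xcom_pull("
    "task_ids=['get_max_order_id'], key='result_status') }}"
)
try:
    eval(unquoted)  # imitates the DAG file being parsed
    parse_error = None
except NameError as exc:
    parse_error = str(exc)

# Quoted: the Jinja expression is just a string argument. str.format() only
# interprets braces in the format string itself, not in its arguments, so the
# double braces reach the sql value unchanged.
sql = 'SELECT * FROM orders where orders_id>{0} and orders_id<{1}'.format(
    LAST_IMPORTED_ORDER_ID,
    "{{ task_instance.xcom_pull("
    "task_ids=['get_max_order_id'], key='result_status') }}",
)

print(parse_error)
print(sql)
```

The first print reports the same undefined name as the Broken DAG message. The second shows a SQL string that still contains the template expression, which is the form the operator needs to receive so that the value can be filled in when the task actually runs.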
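I understand that you are populating LAST_IMPORTED_ORDER_ID from an Airflow Variable. Rather than doing that in the DAG file, you can change {{ params.last_imported_id }} to {{ var.value.last_imported_order_id }}, or whatever name you gave the Airflow Variable.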