Python: using a variable created by a function in an Airflow task

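I have a variable whose value must be generated each time the function below runs, and its result is used to build the query I need to execute next. In the example I'm sharing, I don't think this is the right way to get the information, because the value is generated before the task bq_diff_id executes. Does anyone have a suggestion for obtaining this variable from the result of the previous task instead?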

from google.cloud import bigquery

def get_data_from_bigquery():
    """Query BigQuery for the IDs and return a WHERE-clause filter for the PSQL import."""
    bq = bigquery.Client()
    # Fetch the IDs; each result row has a single column.
    rows = bq.query("SELECT ID FROM dataset.table1").result()
    ids = [row[0] for row in rows]
    if not ids:
        # No IDs found: fall back to the null filter.
        return 'id is null'
    # Build e.g. "id in (1, 2, 3)"; repr() keeps quotes around string IDs.
    return 'id in (' + ', '.join(repr(i) for i in ids) + ')'

Python_1 = PythonOperator(task_id='bq_diff_id',
    python_callable=get_data_from_bigquery,
    dag=dag)

query = get_data_from_bigquery()

# Query for the extract
sql_query = "select id as id, value from table2 where " + query

MsSql = MsSqlToGoogleCloudStorageOperator(
    task_id='import_orders',
    mssql_conn_id=mssql_connection,
    google_cloud_storage_conn_id='gcp',
    sql=sql_query,
    bucket=nm_bucket,
    filename=nm_arquivo,
    schema_filename=sc_arquivo,
    dag=dag)

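Any suggestions?

I don't fully understand your question, but if you want to create a variable programmatically, that is possible. For an example with fine-grained code, see the answer on creating connections programmatically.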
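
One way to get the value only after bq_diff_id has finished is to rely on XCom: a PythonOperator stores its callable's return value in XCom, and a downstream task can pull it at run time rather than when the DAG file is parsed. The sketch below isolates that pull step as a plain function so it can be run without Airflow installed. The names `build_extract_sql`, `SQL_TEMPLATE`, and `FakeTaskInstance` are hypothetical and introduced only for illustration; only the `xcom_pull(task_ids=...)` call mirrors the real task-instance method.

```python
# Sketch only: FakeTaskInstance stands in for Airflow's task instance so the
# example runs without Airflow installed.

SQL_TEMPLATE = "select id as id, value from table2 where {where_clause}"


def build_extract_sql(ti, **_unused_context):
    """Build the extract query from the value returned by bq_diff_id.

    In a DAG, Airflow would pass the real task instance as `ti`.
    """
    # The upstream PythonOperator's return value is stored in XCom and can be
    # pulled by the id of the task that produced it.
    where_clause = ti.xcom_pull(task_ids="bq_diff_id")
    return SQL_TEMPLATE.format(where_clause=where_clause)


class FakeTaskInstance:
    """Hypothetical stand-in that returns canned XCom values."""

    def __init__(self, xcoms):
        self._xcoms = xcoms

    def xcom_pull(self, task_ids):
        return self._xcoms.get(task_ids)


if __name__ == "__main__":
    ti = FakeTaskInstance({"bq_diff_id": "id in (1, 2, 3)"})
    print(build_extract_sql(ti))
    # prints: select id as id, value from table2 where id in (1, 2, 3)
```

In the DAG from the question, this would mean removing the module-level call `query = get_data_from_bigquery()` and making import_orders depend on bq_diff_id (for example `Python_1 >> MsSql`). If the `sql` argument of the extract operator is a templated field, which is an assumption to check against the installed Airflow version, the same pull can probably be written directly in the query string as `{{ ti.xcom_pull(task_ids='bq_diff_id') }}`, so no extra task is needed.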