Airflow - script is not executed when triggered

I have an Airflow script that tries to insert data from one table into another; I am using an Amazon Redshift DB. The script given below is not executed when triggered: the task's status stays at "no status" in the Graph View, and no other error is shown.

## Third party Library Imports

import psycopg2
import airflow
from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
from sqlalchemy import create_engine
import io


# Following are defaults which can be overridden later on
default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(2017, 1, 23, 12),
    'email': ['airflow@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5),
}

dag = DAG('sample_dag', default_args=default_args, catchup=False, schedule_interval="@once")


#######################
## Login to DB

def db_login():
    global db_conn
    try:
        db_conn = psycopg2.connect(
            " dbname = 'name' user = 'user' password = 'pass' host = 'host' port = '5439' sslmode = 'require' ")
    except:
        print("I am unable to connect to the database.")
    print('Connection Task Complete: Connected to DB')
    return db_conn


#######################

def insert_data():
    cur = db_conn.cursor()
    cur.execute("""insert into tbl_1 select id,bill_no,status from tbl_2 limit 2 ;""")
    db_conn.commit()
    print('ETL Task Complete')

def job_run():
    db_login()
    insert_data()

##########################################

t1 = PythonOperator(
    task_id='DBConnect',
    python_callable=job_run,
    bash_command='python3 ~/airflow/dags/sample.py',
    dag=dag)

t1
Can anyone help figure out what is going wrong? Thanks.

Updated code (05/28)

Log messages when the script is run:

[2018-05-28 11:36:45,300] {jobs.py:343} DagFileProcessor26 INFO - Started process (PID=26489) to work on /Users/user/airflow/dags/sample.py
[2018-05-28 11:36:45,306] {jobs.py:534} DagFileProcessor26 ERROR - Cannot use more than 1 thread when using sqlite. Setting max_threads to 1
[2018-05-28 11:36:45,310] {jobs.py:1521} DagFileProcessor26 INFO - Processing file /Users/user/airflow/dags/sample.py for tasks to queue
[2018-05-28 11:36:45,310] {models.py:167} DagFileProcessor26 INFO - Filling up the DagBag from /Users/user/airflow/dags/sample.py
/Users/user/anaconda3/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: <http://initd.org/psycopg/docs/install.html#binary-install-from-pypi>.
  """)
Task Complete: Insert success
[2018-05-28 11:36:50,964] {jobs.py:1535} DagFileProcessor26 INFO - DAG(s) dict_keys(['latest_only', 'example_python_operator', 'test_utils', 'example_bash_operator', 'example_short_circuit_operator', 'example_branch_operator', 'tutorial', 'example_passing_params_via_test_command', 'latest_only_with_trigger', 'example_xcom', 'example_http_operator', 'example_skip_dag', 'example_trigger_target_dag', 'example_branch_dop_operator_v3', 'example_subdag_operator', 'example_subdag_operator.section-1', 'example_subdag_operator.section-2', 'example_trigger_controller_dag', 'insert_data2']) retrieved from /Users/user/airflow/dags/sample.py
[2018-05-28 11:36:51,159] {jobs.py:1169} DagFileProcessor26 INFO - Processing example_subdag_operator
[2018-05-28 11:36:51,167] {jobs.py:566} DagFileProcessor26 INFO - Skipping SLA check for <DAG: example_subdag_operator> because no tasks in DAG have SLAs
[2018-05-28 11:36:51,170] {jobs.py:1169} DagFileProcessor26 INFO - Processing sample_dag
[2018-05-28 11:36:51,174] {jobs.py:354} DagFileProcessor26 ERROR - Got an exception! Propagating...
Traceback (most recent call last):
  File "/Users/user/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 346, in helper
pickle_dags)
  File "/Users/user/anaconda3/lib/python3.6/site-packages/airflow/utils/db.py", line 53, in wrapper
result = func(*args, **kwargs)
  File "/Users/user/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 1581, in process_file
self._process_dags(dagbag, dags, ti_keys_to_schedule)
  File "/Users/user/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 1171, in _process_dags
dag_run = self.create_dag_run(dag)
  File "/Users/user/anaconda3/lib/python3.6/site-packages/airflow/utils/db.py", line 53, in wrapper
result = func(*args, **kwargs)
  File "/Users/user/anaconda3/lib/python3.6/site-packages/airflow/jobs.py", line 776, in create_dag_run
if next_start <= now:
TypeError: '<=' not supported between instances of 'NoneType' and 'datetime.datetime'
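
A quick way to surface DAG-definition errors like the ones in these logs, without waiting for a scheduler pass, is to load the file into a DagBag yourself. A minimal sketch, assuming Airflow 1.x as the logs above indicate:

    # Minimal sketch (Airflow 1.x, matching the logs above): load the DAG file
    # into a DagBag -- roughly what the scheduler's DagFileProcessor does --
    # and print any import errors directly.
    from airflow.models import DagBag

    bag = DagBag(dag_folder='/Users/user/airflow/dags/sample.py')  # path from the logs
    print(bag.import_errors)        # {} when the file parses cleanly
    print(list(bag.dags.keys()))    # DAG ids discovered in the file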
Instead of using only a PythonOperator, you need to use a BashOperator together with the PythonOperator.

You are getting the error because PythonOperator has no bash_command parameter:

t1 = PythonOperator(
    task_id='DBConnect',
    python_callable=db_login,
    dag=dag
)

t2 = BashOperator(
    task_id='Run_Python_File',
    bash_command='python3 ~/airflow/dags/sample.py',
    dag=dag
)

t1 >> t2
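
Note that the snippet above also needs an import for BashOperator, which the original file does not have (the Airflow 1.x module path, matching the logs above):

    # Import assumed by the snippet above (Airflow 1.x layout):
    from airflow.operators.bash_operator import BashOperator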

To extend the answer kaxil provided: you should use an IDE to develop for Airflow. PyCharm works fine for me.

That being said, make sure to look up the available fields in the documentation next time. For the PythonOperator, see the documentation here:

The signature looks like:

class airflow.operators.PythonOperator(python_callable, op_args=None, op_kwargs=None, provide_context=False, templates_dict=None, templates_exts=None, *args, **kwargs)

For the BashOperator, see the documentation here:

The signature is:

class airflow.operators.BashOperator(bash_command, xcom_push=False, env=None, output_encoding='utf-8', *args, **kwargs)

The highlighting is mine, to show the parameters you have been using.

I recommend always taking a good look at the documentation before using an operator.

EDIT

Seeing the updated code, there is one more thing left:

When defining the python_callable in a task, make sure not to use parentheses, otherwise the code will be called right away (which is very unintuitive if you don't know about it). So your code should look like this:

t1 = PythonOperator(
    task_id='DWH_Connect',
    python_callable=job_run,
    dag=dag)
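
If the callable ever needs arguments, pass them through the op_args / op_kwargs parameters from the PythonOperator signature quoted above, since you must not call the function yourself. A minimal sketch; the row_limit parameter is illustrative, not part of the original code:

    # Sketch: Airflow itself calls job_run(2) at execution time.
    def job_run(row_limit):
        print('Row limit for this run:', row_limit)  # illustrative only
        db_login()
        insert_data()

    t1 = PythonOperator(
        task_id='DWH_Connect',
        python_callable=job_run,   # no parentheses: pass the function object
        op_args=[2],
        dag=dag)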

@tobi6, I have combined the two functions into one, but the same problem remains: nothing happens when the job is triggered. I have edited the first post with the updated code.. Tnx

@darkhorse Did you turn the DAG on in the Airflow UI?

@Chengzhi, yes, the DAG is turned on in the UI

Why do you have bash_command='python3 ~/airflow/dags/sample.py' in the PythonOperator? I would suggest you run python your_dag_file.py to find out any compilation errors.

Your DAG file is incorrect, which is why Airflow won't schedule it for you. As it says: Invalid arguments were: *args: () **kwargs: {'bash_command': 'python3 ~/airflow/dags/sample.py'}. Since you are using a PythonOperator, there is no bash_command, which is only available in the BashOperator. Don't mix them up; if you want to run a bash command, use the correct dependencies in Airflow and the correct operator.

@tobi6 Agreed. @darkhorse If sample.py is the DAG file, you don't need this call at all. But if it is a normal Python script to be executed, you would still use the BashOperator to execute it.
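
To make the last comment concrete: if sample.py is the DAG definition file itself, the scheduler imports it on its own and no extra call is needed; a separate, plain Python ETL script would instead be launched through a BashOperator. A sketch under that assumption; etl_script.py is a hypothetical stand-alone script, not a file from this post:

    # Sketch: running a *separate* plain-Python script via BashOperator.
    # etl_script.py is hypothetical; the DAG file itself needs no such task.
    run_etl = BashOperator(
        task_id='run_etl_script',
        bash_command='python3 ~/airflow/dags/etl_script.py',
        dag=dag)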