Airflow: connect to SQL Server and read SELECT results into a DataFrame

Read a SQL query into a DataFrame

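I am trying to connect to a local SQL Server instance to fetch data from a table and process it with pandas, but I can't figure out how to pass the result of a SELECT query into a DataFrame. The task below is the one I use to clear the data in a table: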

```
sql_command = """
    DELETE FROM [TestDB].[dbo].[PythonTestData]
"""

t3 = MsSqlOperator(
    task_id='run_test_proc',
    mssql_conn_id='mssql_local',
    sql=sql_command,
    dag=dag,
    database='TestDB',
    autocommit=True,
)
```
The pandas code I want to schedule is:


```
import pandas as pd
from datetime import datetime
from pathlib import Path

# `conn` is an already-open database connection (not shown in the question)
query = 'SELECT * FROM [ClientData]'

df = pd.read_sql(query, conn)
pn_list = df['ClientID'].tolist()

for i in pn_list:
    varw = str(i)
    # Look up the row for this client
    queryw = 'SELECT * FROM [ClientData] WHERE [ClientID]=' + varw
    dfw = pd.read_sql(queryw, conn)
    dfw = dfw.applymap(str)
    x = dfw.values.tolist()[0]
    ClientID, Name, Org, Email = x[0], x[1], x[2], x[3]

    # Fetch this client's sales rows and derive year/month columns
    salesData_qry = 'SELECT * FROM [TestDB].[dbo].[SalesData] WHERE [ClientID]=' + ClientID
    salesData_df = pd.read_sql(salesData_qry, conn)
    salesData_df['year1'] = salesData_df['Order Date'].dt.strftime('%Y')
    salesData_df['OrderMonth'] = salesData_df['Order Date'].dt.strftime('%b')

    # Write one timestamped CSV per client
    filename = 'Daily_Campaign_Report_' + Name + '_' + Org + '_' + datetime.now().strftime("%Y%m%d_%H%M%S")
    p = Path('C:/Users/user/Documents/WorkingData/')
    salesData_df.to_csv(Path(p, filename + '.csv'))
```

Please point me to the correct approach, as I am new to Airflow.
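For what it's worth, the per-client loop in the question could probably be simplified by reading each table once and filtering in pandas. The following is only a sketch under assumptions: an in-memory SQLite database stands in for SQL Server so that it runs anywhere, the column names `Name`, `Org` and `Email` are guessed from the variable names in the question, and `export_sales_per_client` is a hypothetical helper name.

```python
import sqlite3
import tempfile
from datetime import datetime
from pathlib import Path

import pandas as pd


def export_sales_per_client(clients, sales, out_dir, now=None):
    """Write one CSV of sales rows per client and return the paths written."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(out_dir)

    # Derive the year/month columns once for the whole table
    sales = sales.copy()
    sales["Order Date"] = pd.to_datetime(sales["Order Date"])
    sales["year1"] = sales["Order Date"].dt.strftime("%Y")
    sales["OrderMonth"] = sales["Order Date"].dt.strftime("%b")

    written = []
    for client in clients.itertuples(index=False):
        rows = sales[sales["ClientID"] == client.ClientID]
        name = f"Daily_Campaign_Report_{client.Name}_{client.Org}_{stamp}.csv"
        path = out_dir / name
        rows.to_csv(path, index=False)
        written.append(path)
    return written


# Stand-in data: SQLite replaces SQL Server purely for illustration
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ClientData (ClientID INTEGER, Name TEXT, Org TEXT, Email TEXT);
    CREATE TABLE SalesData (ClientID INTEGER, "Order Date" TEXT, Amount REAL);
    INSERT INTO ClientData VALUES
        (1, 'Ann', 'Acme', 'ann@example.com'),
        (2, 'Bob', 'Beta', 'bob@example.com');
    INSERT INTO SalesData VALUES
        (1, '2021-01-05', 10.0),
        (1, '2021-02-07', 20.0),
        (2, '2021-01-09', 5.0);
    """
)
clients = pd.read_sql("SELECT * FROM ClientData", conn)
sales = pd.read_sql("SELECT * FROM SalesData", conn)

out_dir = tempfile.mkdtemp()
files = export_sales_per_client(clients, sales, out_dir)
```

Inside Airflow, the two DataFrames would come from the MsSQL connection instead of SQLite, and the function would be called from the task's Python callable.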


I'm not entirely clear on how your query code is meant to be built, but to get a DataFrame from MsSQL you need to use `MsSqlHook`.
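A minimal sketch of that suggestion, assuming the `apache-airflow-providers-microsoft-mssql` package is installed and a connection named `mssql_local` exists, as in the question. The hook's `get_pandas_df` method runs a query and returns the result as a DataFrame. The helper name `load_clients` and the import inside the function are illustrative choices, not part of the question. As far as I know, the hook's connection keyword is `mssql_conn_id`.

```python
CLIENT_SQL = "SELECT * FROM [TestDB].[dbo].[ClientData]"


def load_clients(hook, sql=CLIENT_SQL):
    """Run `sql` through any hook exposing get_pandas_df and return the result."""
    return hook.get_pandas_df(sql=sql)


def mssql_func(**kwargs):
    # Imported here so this module can be parsed even where the provider
    # package is missing; a real DAG file would usually import at the top.
    from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook

    hook = MsSqlHook(mssql_conn_id="mssql_local")
    df = load_clients(hook)
    # do whatever you need on the df
    print(df.head())
    return len(df)
```

`mssql_func` would then be passed as `python_callable` to a `PythonOperator`, in the same way as the `run_this` task in the question.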


This is the code I am using for the DAG:

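```
from airflow.operators.python_operator import PythonOperator
from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook


def mssql_func(**kwargs):
    # Look up the Airflow connection, then build the matching hook from it
    conn = MsSqlHook.get_connection(conn_id="mssql_local")
    hook = conn.get_hook()
    df = hook.get_pandas_df(sql="SELECT * FROM [TestDB].[dbo].[ClientData]")
    # do whatever you need on the df
    print(df)


# `dag` is the DAG object defined elsewhere in the file
run_this = PythonOperator(
    task_id='mssql_task',
    python_callable=mssql_func,
    dag=dag,
)
```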
Error log:

```
[2021-01-12 16:07:15,114] {providers_manager.py:159} WARNING - The provider for package 'apache-airflow-providers-imap' could not be registered from because providers for that package name have already been registered
[2021-01-12 16:07:15,618] {base.py:65} INFO - Using connection to: id: mssql_local. Host: localhost, Port: 1433, Schema: dbo, Login: sa, Password: XXXXXXXX, extra: None
[2021-01-12 16:07:15,626] {taskinstance.py:1396} ERROR - (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
Traceback (most recent call last):
  File "src/pymssql.pyx", line 636, in pymssql.connect
  File "src/_mssql.pyx", line 1964, in _mssql.connect
  File "src/_mssql.pyx", line 682, in _mssql.MSSQLConnection.__init__
  File "src/_mssql.pyx", line 1690, in _mssql.maybe_raise_MSSQLDatabaseException
_mssql.MSSQLDatabaseException: (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
```
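One thing that may be worth checking, offered as an assumption rather than a confirmed diagnosis: the log reports `Schema: dbo` for the `mssql_local` connection, while the working `MsSqlOperator` task passes `database = 'TestDB'`. As far as I know, `MsSqlHook` uses the connection's schema field as the database name unless a `schema` argument overrides it, so a hook built from the connection alone may be trying to open a database called `dbo`, which SQL Server can report as a failed login. A sketch of passing the database explicitly follows; `build_hook` and its `hook_cls` parameter are hypothetical names, and `hook_cls` exists only so the function can be exercised without a server.

```python
def build_hook(conn_id="mssql_local", database="TestDB", hook_cls=None):
    """Create an MsSqlHook that targets `database` instead of the connection's schema."""
    if hook_cls is None:
        # Imported lazily so the module still parses without the provider installed
        from airflow.providers.microsoft.mssql.hooks.mssql import MsSqlHook

        hook_cls = MsSqlHook
    # `schema` is the hook's name for the database to connect to
    return hook_cls(mssql_conn_id=conn_id, schema=database)


def mssql_func(**kwargs):
    hook = build_hook()
    df = hook.get_pandas_df(sql="SELECT * FROM [dbo].[ClientData]")
    print(df)
```

If this is indeed the cause, changing the Schema field of the `mssql_local` connection to `TestDB` in the Airflow UI should have the same effect.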

I get a "DB failed to connect" error, even though the same connection works in the script above: `{taskinstance.py:1396} ERROR - (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")`, followed by `Traceback (most recent call last):`

@FaraiHuruba see edit

Still failing with the same error. Below is the entire DAG code. It imports from `airflow.models`, from `airflow.providers.microsoft.mssql.hooks.mssql`, `PythonOperator` from `airflow.operators.python_operator`, and `datetime`, `timedelta` from `datetime`, then continues:
`def mssql_func(**kwargs):`
`    hook = MsSqlHook(conn_id='mssql_local')`
`    df = hook.get_pandas_df(sql="SELECT * FROM [TestDB].[dbo].[ClientData]")`
`    # do whatever you need on the df`
`    print(df)`
`run_this = PythonOperator(task_id='mssql_task', python_callable=mssql_func, dag=dag)`
`run_this`

What import path did you use for `MsSqlOperator`?

`from airflow.providers.microsoft.mssql.operators.mssql import MsSqlOperator`, if that is what you are looking at, because I see we don't use it anywhere in the DAG.