Airflow 连接到sql server选择结果到数据框
将sql查询读取到数据帧 我正在尝试连接到SQLServerLocal以从表中获取数据,并使用pandas操作处理数据,但我无法确定如何将select查询结果传递到数据帧 下表用于清除表中的数据Airflow 连接到sql server选择结果到数据框,airflow,airflow-operator,Airflow,Airflow Operator,将sql查询读取到数据帧 我正在尝试连接到SQLServerLocal以从表中获取数据,并使用pandas操作处理数据,但我无法确定如何将select查询结果传递到数据帧 下表用于清除表中的数据 ``` sql_command = """ DELETE FROM [TestDB].[dbo].[PythonTestData] """ t3 = MsSqlOperator( task_id = 'run_test_proc',
``` sql_command = """ DELETE FROM [TestDB].[dbo].[PythonTestData] """
t3 = MsSqlOperator( task_id = 'run_test_proc',
mssql_conn_id = 'mssql_local',
sql = sql_command,
dag = dag,
database = 'TestDB',
autocommit = True) ```
预定的熊猫是
query = 'SELECT * FROM [ClientData] '#where product_name='''+i+''''''
df = pd.read_sql(query, conn)
pn_list = df['ClientID'].tolist()
#print("The original pn_list is : " + str(pn_list))
for i in pn_list:
varw= str(i)
queryw = 'SELECT * FROM [ClientData] where [ClientID]='''+varw+''
dfw = pd.read_sql(queryw, conn)
dfw = dfw.applymap(str)
cols=['product_id','product_name','brand_id']
x=dfw.values.tolist()
x=x[0]
ClientID=x[0]
Name=x[1]
Org=x[2]
Email=x[3]
#print('Name :'+Name+' ,'+'Org :'+Org+' ,'+'Email :'+Email+' ,'+'ClientID :'+ClientID)
salesData_qry= 'SELECT * FROM [TestDB].[dbo].[SalesData] where [ClientID]='''+ClientID+''
salesData_df= pd.read_sql(salesData_qry, conn)
salesData_df['year1'] = salesData_df['Order Date'].dt.strftime('%Y')
salesData_df['OrderMonth'] = salesData_df['Order Date'].dt.strftime('%b')
filename ='Daily_Campaign_Report_'+Name+'_'+Org+'_'+datetime.now().strftime("%Y%m%d_%H%M%S")
p = Path('C:/Users/user/Documents/WorkingData/')
salesData_df.to_csv(Path(p, filename + '.csv'))```
Please point me to correct approach as i m new to airflow
我不太清楚如何生成查询代码,但为了从MsSQL获取数据帧,需要使用MSSQLShook:
这是我用于dag的代码
def mssql_func(**kwargs):
conn = MsSqlHook.get_connection(conn_id="mssql_local")
hook = conn.get_hook()
df = hook.get_pandas_df(sql="SELECT * FROM [TestDB].[dbo].[ClientData]")
#do whatever you need on the df
print(df)
run_this = PythonOperator(
task_id='mssql_task',
python_callable=mssql_func,
dag=dag
)
错误日志
[2021-01-12 16:07:15,114] {providers_manager.py:159} WARNING - The provider for package 'apache-airflow-providers-imap' could not be registered from because providers for that package name have already been registered
[2021-01-12 16:07:15,618] {base.py:65} INFO - Using connection to: id: mssql_local. Host: localhost, Port: 1433, Schema: dbo, Login: sa, Password: XXXXXXXX, extra: None
[2021-01-12 16:07:15,626] {taskinstance.py:1396} ERROR - (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
Traceback (most recent call last):
File "src/pymssql.pyx", line 636, in pymssql.connect
File "src/_mssql.pyx", line 1964, in _mssql.connect
File "src/_mssql.pyx", line 682, in _mssql.MSSQLConnection.__init__
File "src/_mssql.pyx", line 1690, in _mssql.maybe_raise_MSSQLDatabaseException
_mssql.MSSQLDatabaseException: (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
我收到一个错误db failed to connect,但是我在上面脚本中使用的相同连接工作,并且connect{taskinstance.py:1396}错误-18456,用户'sa'的bLogin失败。db Lib错误消息20018,严重性14:\n一般SQL Server错误:检查来自SQL Server的消息\nDB Lib错误消息20002,严重性9:\n适配器服务器连接本地主机失败\n数据库错误消息20002,严重性9:\n适配器服务器连接失败localhost\n回溯最近一次调用上次:@FaraiHuruba see editStill failing与下面相同的错误是来自airflow的整个dag代码。模型从airflow.providers.microsoft.mssql.hooks.mssql从airflow.operators.python\u operator导入PythonOperator从datetime导入datetime、timedelta。。。def mssql\u func**kwargs:hook=MsSqlHookconn\u id='mssql\u local'df=hook.get\u pandas\u dfsql=SELECT*FROM[TestDB].[dbo].[ClientData]在df printdf运行时执行您需要的任何操作\u this=PythonOperator任务\u id='mssql\u task',python\u callable=mssql\u func,dag=dag run_this您用于MSSQLSOperator的导入路径是什么?来自aiffair.providers.microsoft.mssql.operators.mssql导入MSSQLSOperator不如果这是您正在查看的,因为我看到我们在dag上的任何位置都不使用它
[2021-01-12 16:07:15,114] {providers_manager.py:159} WARNING - The provider for package 'apache-airflow-providers-imap' could not be registered from because providers for that package name have already been registered
[2021-01-12 16:07:15,618] {base.py:65} INFO - Using connection to: id: mssql_local. Host: localhost, Port: 1433, Schema: dbo, Login: sa, Password: XXXXXXXX, extra: None
[2021-01-12 16:07:15,626] {taskinstance.py:1396} ERROR - (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")
Traceback (most recent call last):
File "src/pymssql.pyx", line 636, in pymssql.connect
File "src/_mssql.pyx", line 1964, in _mssql.connect
File "src/_mssql.pyx", line 682, in _mssql.MSSQLConnection.__init__
File "src/_mssql.pyx", line 1690, in _mssql.maybe_raise_MSSQLDatabaseException
_mssql.MSSQLDatabaseException: (18456, b"Login failed for user 'sa'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (localhost)\n")