SQL Server Spark SQL JDBC empty DataFrame
Tags: sql-server, apache-spark, jdbc, pyspark

I am trying to load the result of a query from one table into another. It connects fine and executes a query to fetch the metadata, but it returns no data:
from pyspark.sql import SparkSession

# Put the Microsoft JDBC driver on the driver classpath
# (note the escaped backslashes in the Windows path)
spark = SparkSession.builder \
    .config("spark.driver.extraClassPath",
            "C:\\spark\\SQL\\sqljdbc_7.0\\enu\\mssql-jdbc-7.0.0.jre10.jar") \
    .getOrCreate()

SQL = "SELECT [InvoiceID],[CustomerID],[BillToCustomerID],[OrderID],[DeliveryMethodID],[ContactPersonID],[AccountsPersonID],[SalespersonPersonID],[PackedByPersonID],[InvoiceDate],[CustomerPurchaseOrderNumber],[IsCreditNote],[CreditNoteReason],[Comments],[DeliveryInstructions],[InternalComments],[TotalDryItems],[TotalChillerItems],[DeliveryRun],[RunPosition],[ReturnedDeliveryData],[ConfirmedDeliveryTime],[ConfirmedReceivedBy],[LastEditedBy],[LastEditedWhen] FROM [Sales].[Invoices]"

pgDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://localhost") \
    .option("query", SQL) \
    .option("user", "dp_admin") \
    .option("databaseName", "WideWorldImporters") \
    .option("password", "password") \
    .option("fetchsize", 1000) \
    .load()  # load() takes no argument; the statement comes from the "query" option
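For reference, on Spark versions that predate the `query` option (it was added, as far as I recall, in Spark 2.4), the same read is usually expressed by passing the SELECT as an aliased subquery through `dbtable`. A string-only sketch of building that option value (no database connection needed; the helper name is illustrative):

```python
# Sketch: the "dbtable" equivalent of the "query" option is the SELECT
# wrapped as a derived table. SQL Server requires derived tables to be
# aliased, hence the "AS <alias>" suffix.
def as_dbtable_subquery(user_query: str, alias: str = "src") -> str:
    return f"({user_query}) AS {alias}"

print(as_dbtable_subquery("SELECT [InvoiceID] FROM [Sales].[Invoices]"))
```

The result would then be used as `.option("dbtable", as_dbtable_subquery(SQL))` in place of `.option("query", SQL)`.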
pgDF.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://localhost") \
    .option("dbtable", "wwi.Sales_InvoiceLines") \
    .option("user", "dp_admin") \
    .option("databaseName", "DW_Staging") \
    .option("password", "password") \
    .mode("overwrite") \
    .save()  # the save mode is set with .mode(), not as an option; save() is the action that triggers execution
Looking at SQL Server Profiler:
exec sp_executesql N'SELECT * FROM (Select [InvoiceID],[CustomerID],[BillToCustomerID],[OrderID],[DeliveryMethodID],[ContactPersonID],[AccountsPersonID],[SalespersonPersonID],[PackedByPersonID],[InvoiceDate],[CustomerPurchaseOrderNumber],[IsCreditNote],[CreditNoteReason],[Comments],[DeliveryInstructions],[InternalComments],[TotalDryItems],[TotalChillerItems],[DeliveryRun],[RunPosition],[ReturnedDeliveryData],[ConfirmedDeliveryTime],[ConfirmedReceivedBy],[LastEditedBy],[LastEditedWhen] FROM [Sales].[Invoices]) __SPARK_GEN_JDBC_SUBQUERY_NAME_0 WHERE 1=0'
A WHERE 1=0 is appended and no data is returned. Why, and how do I remove it?

Hi, could you try it without the fetchsize option? — Removing the fetchsize option does not change the result; the WHERE clause is still there. Apparently the WHERE clause is how Spark reads the metadata from the table and maps the data types onto the DataFrame. But I still don't know why the DataFrame is not being populated.

Have you tried running the SELECT from SSMS? I don't know Spark, but by stripping the code down step by step you can see where it goes wrong. Try a SELECT 1, 'test' as the query. Is your table perhaps not in the dbo schema, but in another one?

I think you have to load the table via dbtable and then run the query. Try replacing .option('dbtable', query) with .option('query', query). Take a look at this.
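The WHERE 1=0 query in the Profiler trace is Spark's schema probe: it asks SQL Server for the result-set metadata (column names and types) without fetching any rows. A simplified, string-only sketch of what Spark generates (this is not Spark's actual code; the function name is illustrative):

```python
# Simplified sketch of the schema-resolution query Spark issues for a JDBC
# source: the user's query is wrapped as an aliased subquery with WHERE 1=0,
# so the server returns column metadata but zero rows. The real data is only
# fetched later, when an action (count, show, save, ...) runs.
def schema_probe(user_query: str,
                 alias: str = "__SPARK_GEN_JDBC_SUBQUERY_NAME_0") -> str:
    return f"SELECT * FROM ({user_query}) {alias} WHERE 1=0"

print(schema_probe("SELECT [InvoiceID] FROM [Sales].[Invoices]"))
```

So the probe itself is expected behavior and cannot be removed; an empty result for that statement is not the missing data. Because Spark evaluates lazily, the actual SELECT only runs once an action such as `.save()` on the write (or `pgDF.count()`) is executed.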