SQL Server: Spark SQL JDBC empty DataFrame


I am trying to load the results of a query from one table into another table. It connects fine and runs the query to get the metadata, but it does not return any data.

from pyspark.sql import SQLContext, Row, SparkSession

spark = SparkSession.builder.config("spark.driver.extraClassPath", "C:\\spark\\SQL\\sqljdbc_7.0\\enu\\mssql-jdbc-7.0.0.jre10.jar").getOrCreate()

SQL = "Select [InvoiceID],[CustomerID],[BillToCustomerID],[OrderID],[DeliveryMethodID],[ContactPersonID],[AccountsPersonID],[SalespersonPersonID],[PackedByPersonID],[InvoiceDate],[CustomerPurchaseOrderNumber],[IsCreditNote],[CreditNoteReason],[Comments],[DeliveryInstructions],[InternalComments],[TotalDryItems],[TotalChillerItems],[DeliveryRun],[RunPosition],[ReturnedDeliveryData],[ConfirmedDeliveryTime],[ConfirmedReceivedBy],[LastEditedBy],[LastEditedWhen] FROM [Sales].[Invoices]"

pgDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://Localhost") \
    .option("query", SQL) \
    .option("user", "dp_admin") \
    .option("Database", "WideWorldImporters") \
    .option("password", "password") \
    .option("fetchsize", 1000) \
    .load(SQL)

pgDF.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://Localhost") \
    .option("dbtable", "wwi.Sales_InvoiceLines") \
    .option("user", "dp_admin") \
    .option("Database", "DW_Staging") \
    .option("password", "password") \
    .option("mode", "overwrite")
Looking at SQL Server Profiler, I see:

exec sp_executesql N'SELECT * FROM (Select [InvoiceID],[CustomerID],[BillToCustomerID],[OrderID],[DeliveryMethodID],[ContactPersonID],[AccountsPersonID],[SalespersonPersonID],[PackedByPersonID],[InvoiceDate],[CustomerPurchaseOrderNumber],[IsCreditNote],[CreditNoteReason],[Comments],[DeliveryInstructions],[InternalComments],[TotalDryItems],[TotalChillerItems],[DeliveryRun],[RunPosition],[ReturnedDeliveryData],[ConfirmedDeliveryTime],[ConfirmedReceivedBy],[LastEditedBy],[LastEditedWhen] FROM [Sales].[Invoices]) __SPARK_GEN_JDBC_SUBQUERY_NAME_0 WHERE 1=0'

A WHERE 1=0 is appended and no data is returned. Why does that happen, and how do I get rid of it?
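For context: that WHERE 1=0 statement is Spark's own schema probe. The JDBC source wraps the supplied query in a zero-row subquery to read the column names and types, and because Spark evaluates lazily, the actual data query is only sent once an action runs. As a sketch, reusing the pgDF from the snippet above, forcing an action while watching Profiler makes this visible:

# Forcing an action makes Spark issue the real data query,
# which should then appear in Profiler next to the WHERE 1=0 probe.
pgDF.show(5)          # fetches and prints the first 5 rows
print(pgDF.count())   # row count; also triggers the full query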

Comments:

- Hi, could you try it without the fetchsize option?
- Removing the fetchsize option doesn't change the result; the WHERE clause is still there. Apparently the WHERE clause is what Spark uses to read the metadata from the table and map the data types onto the DataFrame. But I still don't know why the DataFrame isn't being populated.
- Have you tried running the SELECT from SSMS? I don't know Spark, but by stripping the code down step by step you can see where it goes wrong. Try a simple select 1, 'test' as the query. Is your table in the dbo schema or a different one?
- I think you have to load the table first and then run the query.
- Please try replacing .option('dbtable', query) with .option('query', query). Take a look at this.
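Following the comments, here is a minimal sketch of what a complete read/write round trip could look like. Two details in the original snippet are likely why no data ever moves: the write chain is never finished with .save(), so the insert (and therefore the full data query) is never executed, and the save mode is normally set with .mode("overwrite") rather than an option. Also, when the query option is used, .load() is normally called with no argument instead of .load(SQL). In the sketch, the databaseName property in the URL is a substitution for the separate Database option; server, credentials, and table names are taken from the question, and the column list is shortened for brevity.

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .config("spark.driver.extraClassPath", "C:\\spark\\SQL\\sqljdbc_7.0\\enu\\mssql-jdbc-7.0.0.jre10.jar") \
    .getOrCreate()

# Shortened column list; the full SELECT from the question works the same way.
SQL = "SELECT [InvoiceID], [CustomerID], [InvoiceDate] FROM [Sales].[Invoices]"

# Read: with the "query" option, .load() takes no argument.
# Only the WHERE 1=0 schema probe runs here; the full query runs when an action or a write executes.
pgDF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://Localhost;databaseName=WideWorldImporters") \
    .option("query", SQL) \
    .option("user", "dp_admin") \
    .option("password", "password") \
    .option("fetchsize", 1000) \
    .load()

# Write: nothing is sent to the target until .save() is called.
pgDF.write \
    .format("jdbc") \
    .option("url", "jdbc:sqlserver://Localhost;databaseName=DW_Staging") \
    .option("dbtable", "wwi.Sales_InvoiceLines") \
    .option("user", "dp_admin") \
    .option("password", "password") \
    .mode("overwrite") \
    .save()

If data still does not arrive after that, the comments' suggestion of running the same SELECT in SSMS, or swapping in a trivial select 1, 'test', is a quick way to rule out the query and the database context.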