SQL Server: spark.read error when reading a SQL Server table (via a JDBC connection)


I am running into a problem in Zeppelin when I try to create a DataFrame that reads directly from a SQL Server table. The issue is that I don't know how to read a SQL column of the geography type.

Here is the code I am using, along with the error I get.

Creating the JDBC connection:

import org.apache.spark.sql.SaveMode
import java.util.Properties

val jdbcHostname = "XX.XX.XX.XX"
val jdbcDatabase = "databasename"
val jdbcUsername = "user"
val jdbcPassword = "XXXXXXXX"

// Create the JDBC URL without passing in the user and password parameters.
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname};database=${jdbcDatabase}"

// Create a Properties() object to hold the parameters.
val connectionProperties = new Properties()
connectionProperties.put("user", s"${jdbcUsername}")
connectionProperties.put("password", s"${jdbcPassword}")
connectionProperties.setProperty("Driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
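
As a quick sanity check (a diagnostic sketch only; it assumes the PostcodeNoSpaces column and the table name that appear later in this question), the same URL and properties can first be verified with a subquery that leaves the geography column out entirely:

// Diagnostic sketch: read a subquery that skips the geography column,
// so only the connection details themselves are being tested.
val probe = spark.read.jdbc(
  jdbcUrl,
  "(select top 10 PostcodeNoSpaces from Lookup.Postcode50m_Lookup) as probe",
  connectionProperties)
probe.printSchema()
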
Reading from SQL:

import spark.implicits._

val table = "tablename"

val postcode_polygons = spark.
    read.
    jdbc(jdbcUrl, table, connectionProperties)
The error:

import spark.implicits._
table: String = Lookup.Postcode50m_Lookup
java.sql.SQLException: Unsupported type -158
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:233)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:290)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$8.apply(JdbcUtils.scala:290)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:289)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:114)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:52)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
  at org.apache.spark.sql.DataFrameReader.jdbc(DataFrameReader.scala:193)
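
For context: -158 is not one of the standard java.sql.Types codes; it appears to be the vendor-specific code the Microsoft JDBC driver reports for geography columns, and Spark's JdbcUtils.getCatalystType (visible in the trace above) only maps the standard codes, which is why schema resolution throws. A minimal sketch, assuming the mssql-jdbc driver is on the classpath and using the table name from the trace, to inspect the type codes the driver reports over plain JDBC:

import java.sql.DriverManager

// Print the JDBC type code of every column; Spark fails on any code it
// cannot map to a Catalyst type (the geography column reports -158 here).
val conn = DriverManager.getConnection(jdbcUrl, connectionProperties)
try {
  val rsmd = conn.createStatement()
    .executeQuery("select top 1 * from Lookup.Postcode50m_Lookup")
    .getMetaData
  for (i <- 1 to rsmd.getColumnCount)
    println(s"${rsmd.getColumnName(i)}: ${rsmd.getColumnType(i)} (${rsmd.getColumnTypeName(i)})")
} finally {
  conn.close()
}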

Adding to BluePhantom's answer: have you tried changing the type to a string, as below, and then loading the table?

val jdbcDF = spark.read.format("jdbc")
  .option("url", jdbcUrl)
  .option("dbtable", "(select toString(SData) as s_sdata, toString(CentroidSData) as s_centroidSdata from table) t")
  .option("user", "user_name")
  // ... the remaining connection options (password, driver, etc.)
  .load()
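
The trick in this suggestion is that the conversion is pushed into a subquery passed as dbtable, so SQL Server turns the geography column into a string before Spark ever inspects the result set's schema, and the unsupported type code never reaches JdbcUtils. (As the accepted answer below notes, the toString call itself did not work for the asker and is replaced with a cast.)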

Here is my final solution. moasifk's idea was right, but in my code I cannot use the toString function, so I applied the same idea in a different way:

import spark.implicits._

val tablename = "Lookup.Postcode50m_Lookup"

val postcode_polygons = spark.
    read.
    jdbc(jdbcUrl, table=s"(select PostcodeNoSpaces, cast(SData as nvarchar(4000)) as SData from $tablename) as postcode_table", connectionProperties)
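
As a possible variation (untested here; same table and column names as above): SQL Server's geography type also exposes an STAsText() method that returns the shape as Well-Known Text, so the server-side cast could presumably be swapped for an explicit WKT conversion:

// Alternative sketch: ask SQL Server for Well-Known Text explicitly
// instead of casting the geography value to nvarchar.
val postcode_polygons_wkt = spark.
    read.
    jdbc(jdbcUrl, table=s"(select PostcodeNoSpaces, SData.STAsText() as SData from $tablename) as postcode_table", connectionProperties)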

An unsupported column type, or an invalid table type, or something else?

I have used this code before when reading other tables, so I don't think the problem is the table or the code, but rather the column with the geography type.

I think you have answered your own question.