Apache Spark: connecting to a SQL database from Spark

I am trying to connect to a SQL database from Spark, and I used the following commands:

scala> import org.apache.spark.sql.SQLContext                                                                                                                                                                      
import org.apache.spark.sql.SQLContext

scala> val sqlcontext = new org.apache.spark.sql.SQLContext(sc)                                                                                                                                                    
warning: there was one deprecation warning; re-run with -deprecation for details
sqlcontext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@2bf4fa1

scala> val dataframe_mysql = sqlcontext.read.format("jdbc").option("url", "jdbc:sqlserver:192.168.103.64/DRE").option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver").option("dbtable", "NCentralAlerts")
.option("user", "sqoop").option("password", "hadoop").load()
java.lang.ClassNotFoundException: com.microsoft.sqlserver.jdbc.SQLServerDriver
  at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
  at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
  at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:79)
  at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
  at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
  at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
  at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
  ... 49 elided

I can see that Spark is looking for the SQL Server driver. In which directory should I put this driver?

I can see from the logs that you are trying to run this from spark-shell. Assuming you have the driver jar at hand, start spark-shell with the following addition:

spark-shell --jars /path/to/driver.jar
That way the jar is added to the classpath and you will be able to use the driver.
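
For completeness, here is a sketch of the read once the jar is on the classpath. Two things in your snippet are worth fixing along the way: the JDBC URL is malformed (the SQL Server driver expects jdbc:sqlserver://host;databaseName=db, not jdbc:sqlserver:host/db), and since the shell already warned that SQLContext is deprecated, on Spark 2.x you can use the built-in spark session instead:

// inside spark-shell (Spark 2.x): the built-in `spark` session replaces SQLContext
val dataframe_mysql = spark.read
  .format("jdbc")
  // URL format corrected: host and database are separated by ;databaseName=
  .option("url", "jdbc:sqlserver://192.168.103.64;databaseName=DRE")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .option("dbtable", "NCentralAlerts")
  .option("user", "sqoop")
  .option("password", "hadoop")
  .load()

dataframe_mysql.show()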

Hope this helps.
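
One more note: if you do not have the jar locally, spark-shell can also resolve the driver from Maven Central with --packages. The version below is only an example; pick the mssql-jdbc release that matches your JRE:

spark-shell --packages com.microsoft.sqlserver:mssql-jdbc:8.4.1.jre8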