PostgreSQL: how to include the JDBC JAR in Spark using Maven
I have a Spark (2.1.0) job that uses the PostgreSQL JDBC driver, as described below. I write out with the DataFrame writer, like so:
val jdbcURL = s"jdbc:postgresql://${config.pgHost}:${config.pgPort}/${config.pgDatabase}?user=${config.pgUser}&password=${config.pgPassword}"
val connectionProperties = new Properties()
connectionProperties.put("user", config.pgUser)
connectionProperties.put("password", config.pgPassword)
dataFrame.write.mode(SaveMode.Overwrite).jdbc(jdbcURL, tableName, connectionProperties)
I managed to include the driver by downloading the JDBC jar manually and passing --jars postgresql-42.1.1.jar --driver-class-path postgresql-42.1.1.jar.
However, I would rather not have to download it first.
I tried --jars https://jdbc.postgresql.org/download/postgresql-42.1.1.jar, but that fails with:
Exception in thread "main" java.io.IOException: No FileSystem for scheme: http
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2584)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:91)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2630)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2612)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:370)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:364)
at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:480)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$11$$anonfun$apply$8.apply(Client.scala:600)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$11$$anonfun$apply$8.apply(Client.scala:599)
at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$11.apply(Client.scala:599)
at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$11.apply(Client.scala:598)
at scala.collection.immutable.List.foreach(List.scala:381)
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:598)
at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:868)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:170)
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1154)
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1213)
at org.apache.spark.deploy.yarn.Client.main(Client.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
I have also tried:
including "org.postgresql" % "postgresql" % "42.1.1" in my build.sbt file
the spark-submit options --repositories https://mvnrepository.com/artifact --packages org.postgresql:postgresql:42.1.1
the spark-submit options --repositories https://mvnrepository.com/artifact --conf "spark.jars.packages=org.postgresql:postgresql:42.1.1"
Each of these fails in the same way:
17/08/01 13:14:49 ERROR yarn.ApplicationMaster: User class threw exception: java.sql.SQLException: No suitable driver
java.sql.SQLException: No suitable driver
at java.sql.DriverManager.getDriver(DriverManager.java:315)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$7.apply(JDBCOptions.scala:84)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:83)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:34)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:426)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:215)
at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:446)
Specify the driver option, the same way you specified user and password, giving it the JDBC driver class. Alternatively, you can copy the JDBC jar file into the jars folder of your Spark installation and deploy your application with spark-submit without the --jars option.
Cool, that works. I would still prefer it to live in the project itself (so I can distribute my jar and run the command without any extra setup), but this is better than what I had.
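Concretely, the first suggestion means naming the JDBC driver class in the connection properties before calling the writer. A minimal sketch, with placeholder credentials in place of the question's config values (the final write call is left commented out, since it needs a live SparkSession and database):

```scala
import java.util.Properties

object DriverOptionSketch {
  def main(args: Array[String]): Unit = {
    // Build the connection properties as in the question, and additionally
    // set the "driver" property so Spark loads org.postgresql.Driver
    // explicitly instead of relying on DriverManager discovery, which is
    // what throws "No suitable driver" on the executors.
    val connectionProperties = new Properties()
    connectionProperties.put("user", "myUser")         // placeholder
    connectionProperties.put("password", "myPassword") // placeholder
    connectionProperties.put("driver", "org.postgresql.Driver")

    println(connectionProperties.getProperty("driver"))

    // With a real SparkSession and jdbcURL/tableName from the question:
    // dataFrame.write.mode(SaveMode.Overwrite).jdbc(jdbcURL, tableName, connectionProperties)
  }
}
```

With the driver property set, the JDBCOptions code path shown in the stack trace no longer has to ask DriverManager for a driver matching the URL, which is why this fixes the error even when the jar arrives via --packages.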