Apache Spark: error when using a Delta Lake source on Spark 2.4 (HDInsight)
Tags: apache-spark, azure-hdinsight, delta-lake

The same code works in Databricks but fails on HDInsight with the error below. I have already added the Delta and hadoop-azure libraries to the classpath:
io.delta:delta-core_2.11:0.5.0,org.apache.hadoop:hadoop-azure:3.1.3
ERROR ApplicationMaster [Driver]: User class threw exception: com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/experimental/ScalaObjectMapper$class
com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/experimental/ScalaObjectMapper$class
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4873)
at org.apache.spark.sql.delta.DeltaLog$.apply(DeltaLog.scala:740)
at org.apache.spark.sql.delta.DeltaLog$.forTable(DeltaLog.scala:712)
at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:169)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at io.delta.tables.DeltaTable$.forPath(DeltaTable.scala:635)
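For context, the failure surfaces as soon as `DeltaLog` is initialized by an ordinary Delta read submitted to YARN. A minimal sketch of such a submission (the main class and jar name are hypothetical; the package coordinates are the ones from the question):

```shell
# Minimal sketch of the failing submission on an HDInsight cluster.
# Delta 0.5.0 and hadoop-azure are pulled via --packages, as in the question;
# com.example.DeltaReadJob and my-job.jar are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --packages io.delta:delta-core_2.11:0.5.0,org.apache.hadoop:hadoop-azure:3.1.3 \
  --class com.example.DeltaReadJob \
  my-job.jar
```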
There is a conflict between the version of the Jackson JSON library that is packaged with HDInsight and used by Spark, and the version that Delta Lake needs. There are two ways to resolve this conflict.
Comments:

- Could you check the version of jackson-module-scala on your classpath? It looks like you are using an incompatible version.
- I'm using 2.11.1:

      <dependency>
        <groupId>com.fasterxml.jackson.module</groupId>
        <artifactId>jackson-module-scala_2.11</artifactId>
        <version>2.11.1</version>
        <scope>test</scope>
      </dependency>

- Spark 2.4.6 uses 2.6.7.1, so it is best to use the same version. com/fasterxml/jackson/module/scala/experimental/ScalaObjectMapper$class is no longer present in jackson-module-scala 2.11.1.
- Thanks!! I tried that but still hit the same issue. The same error also shows up in spark-shell:
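To confirm which jackson-module-scala version Spark actually loads, one can list the candidate jars on the cluster and check whether they still contain the `ScalaObjectMapper$class` entry that the stack trace says is missing. A sketch, assuming the usual HDInsight Spark 2 jar directory (adjust the path to your cluster layout):

```shell
# List jackson-module-scala jars on the Spark classpath and check each one for
# the Scala trait impl class reported missing in the NoClassDefFoundError.
# /usr/hdp/current/spark2-client/jars is the typical HDInsight location (assumption).
for j in /usr/hdp/current/spark2-client/jars/jackson-module-scala*.jar; do
  echo "== $j"
  unzip -l "$j" | grep 'experimental/ScalaObjectMapper' || echo "   (class not present)"
done
```

If the grep finds nothing, the jar on the classpath is one of the newer versions that dropped the class, which matches the error.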
{"conf": {
  "spark.jars.packages": "io.delta:delta-core_2.11:0.5.0",
  "spark.driver.extraClassPath": "${PATH}/jackson-module-scala_2.11-2.6.7.1.jar;${PATH}/jackson-annotations-2.6.7.jar;${PATH}/jackson-core-2.6.7.jar;${PATH}/jackson-databind-2.6.7.1.jar;${PATH}/jackson-module-paranamer-2.6.7.jar",
  "spark.executor.extraClassPath": "${PATH}/jackson-module-scala_2.11-2.6.7.1.jar;${PATH}/jackson-annotations-2.6.7.jar;${PATH}/jackson-core-2.6.7.jar;${PATH}/jackson-databind-2.6.7.1.jar;${PATH}/jackson-module-paranamer-2.6.7.jar",
  "spark.driver.userClassPathFirst": true
}}
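One detail worth checking in the payload above: HDInsight nodes run Linux, where the classpath separator in `extraClassPath` is `:`, not the `;` used on Windows, so with semicolons the whole value may be treated as one nonexistent jar path. A sketch of the same Livy batch request with colons (the Livy endpoint host is illustrative, and `${PATH}` stands for the jar directory exactly as in the question):

```shell
# Hypothetical Livy batch submission; note ':' as the Linux classpath separator.
# <livy-endpoint> is a placeholder for your cluster's Livy host.
curl -X POST -H 'Content-Type: application/json' \
  http://<livy-endpoint>:8998/batches \
  -d '{"conf": {
        "spark.jars.packages": "io.delta:delta-core_2.11:0.5.0",
        "spark.driver.extraClassPath": "${PATH}/jackson-module-scala_2.11-2.6.7.1.jar:${PATH}/jackson-annotations-2.6.7.jar:${PATH}/jackson-core-2.6.7.jar:${PATH}/jackson-databind-2.6.7.1.jar:${PATH}/jackson-module-paranamer-2.6.7.jar",
        "spark.executor.extraClassPath": "${PATH}/jackson-module-scala_2.11-2.6.7.1.jar:${PATH}/jackson-annotations-2.6.7.jar:${PATH}/jackson-core-2.6.7.jar:${PATH}/jackson-databind-2.6.7.1.jar:${PATH}/jackson-module-paranamer-2.6.7.jar",
        "spark.driver.userClassPathFirst": "true"}}'
```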