Apache Spark sbt-assembly shading to create a fat jar that runs on Spark
I am using sbt-assembly to create a fat jar that can run on Spark. The jar depends on grpc-netty. The Guava version bundled with Spark is older than the one grpc-netty requires, and I run into an error because of that. I can work around it by setting userClassPathFirst to true on Spark, but that leads to other errors.
Correct me if I'm wrong, but as far as I understand, if I shade correctly I shouldn't have to set userClassPathFirst to true. This is what I'm doing now:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.guava.**" -> "my_conf.@1")
    .inLibrary("com.google.guava" % "guava" % "20.0")
    .inLibrary("io.grpc" % "grpc-netty" % "1.1.2")
)
libraryDependencies ++= Seq(
  "org.scalaj" %% "scalaj-http" % "2.3.0",
  "org.json4s" %% "json4s-native" % "3.2.11",
  "org.json4s" %% "json4s-jackson" % "3.2.11",
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" % "spark-sql_2.11" % "2.2.0" % "provided",
  "org.clapper" %% "argot" % "1.0.3",
  "com.typesafe" % "config" % "1.3.1",
  "com.databricks" %% "spark-csv" % "1.5.0",
  "org.apache.spark" % "spark-mllib_2.11" % "2.2.0" % "provided",
  "io.grpc" % "grpc-netty" % "1.1.2",
  "com.google.guava" % "guava" % "20.0"
)
What am I doing wrong here, and how do I fix it?

You're almost there.
What ShadeRule.rename operates on is class (package) names, not library names. As the sbt-assembly docs put it: the main ShadeRule.rename rule is used to rename classes, and all references to the renamed classes are updated as well. In fact, guava-20.0.jar contains no classes in the package com.google.guava at all:
$ jar tf ~/Downloads/guava-20.0.jar | sed -e 's:/[^/]*$::' | sort | uniq
META-INF
META-INF/maven
META-INF/maven/com.google.guava
META-INF/maven/com.google.guava/guava
com
com/google
com/google/common
com/google/common/annotations
com/google/common/base
com/google/common/base/internal
com/google/common/cache
com/google/common/collect
com/google/common/escape
com/google/common/eventbus
com/google/common/graph
com/google/common/hash
com/google/common/html
com/google/common/io
com/google/common/math
com/google/common/net
com/google/common/primitives
com/google/common/reflect
com/google/common/util
com/google/common/util/concurrent
com/google/common/xml
com/google/thirdparty
com/google/thirdparty/publicsuffix
Changing the shade rule to the following is enough:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "my_conf.@1")
    .inLibrary("com.google.guava" % "guava" % "20.0")
    .inLibrary("io.grpc" % "grpc-netty" % "1.1.2")
)
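For clarity on the pattern syntax: the @1 placeholder expands to whatever the ** wildcard matched, so the rule relocates every class under com.google.common while keeping the rest of its path. An illustrative sketch of the resulting mapping (the class names below are real Guava classes picked as examples):

```scala
// Effect of ShadeRule.rename("com.google.common.**" -> "my_conf.@1"):
//   com.google.common.base.Preconditions  =>  my_conf.base.Preconditions
//   com.google.common.cache.Cache         =>  my_conf.cache.Cache
// Bytecode references inside the shaded libraries (e.g. grpc-netty's calls
// into Guava) are rewritten to the relocated names as well.
```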
With that, you don't need to change userClassPathFirst.

Also, you can simplify the shade rule like this:
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "my_conf.@1").inAll
)
Since the org.apache.spark dependencies are marked provided, they are neither included in your jar nor shaded, so Spark will use its own, unshaded version of Guava that it ships on the cluster.
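Putting it together, a minimal build.sbt sketch of the working setup, trimmed to the parts relevant to shading (versions and the my_conf prefix are carried over from above; treat this as illustrative rather than a drop-in file):

```scala
// Spark is "provided": not bundled in the fat jar and not shaded, so Spark
// itself keeps using the cluster's own Guava, while grpc-netty is rewired to
// the relocated copy inside the assembly.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "io.grpc"          %  "grpc-netty" % "1.1.2",
  "com.google.guava" %  "guava"      % "20.0"
)

// Relocate the real Guava package (com.google.common, not com.google.guava)
// in every non-provided dependency; @1 expands to whatever ** matched.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "my_conf.@1").inAll
)
```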