Shading with sbt-assembly to create a fat jar that runs on Spark

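I'm using sbt-assembly to create a fat jar that can run on Spark. The project depends on grpc-netty. The Guava version on Spark is older than the one grpc-netty requires, and I run into an error because of it. I was able to work around this by setting userClassPathFirst to true on Spark, but that leads to other errors.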

Correct me if I'm wrong, but as I understand it, I shouldn't have to set userClassPathFirst to true if I do the shading correctly. This is how I do it now:

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.guava.**" -> "my_conf.@1")
    .inLibrary("com.google.guava" % "guava" % "20.0")
    .inLibrary("io.grpc" % "grpc-netty" % "1.1.2")
)

libraryDependencies ++= Seq(
  "org.scalaj" %% "scalaj-http" % "2.3.0",
  "org.json4s" %% "json4s-native" % "3.2.11",
  "org.json4s" %% "json4s-jackson" % "3.2.11",
  "org.apache.spark" %% "spark-core" % "2.2.0" % "provided",
  "org.apache.spark" % "spark-sql_2.11" % "2.2.0" % "provided",
  "org.clapper" %% "argot" % "1.0.3",
  "com.typesafe" % "config" % "1.3.1",
  "com.databricks" %% "spark-csv" % "1.5.0",
  "org.apache.spark" % "spark-mllib_2.11" % "2.2.0" % "provided",
  "io.grpc" % "grpc-netty" % "1.1.2",
  "com.google.guava" % "guava" % "20.0"
)

What am I doing wrong here, and how can I fix it?

You are almost there. What a ShadeRule does is rename classes, not library names:

The main ShadeRule.rename rule is used to rename classes. All references to the renamed classes will also be updated.

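In fact, the Guava jar contains no classes in the package com.google.guava. Its classes live under com.google.common and com.google.thirdparty: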

$ jar tf ~/Downloads/guava-20.0.jar  | sed -e 's:/[^/]*$::' | sort | uniq
META-INF
META-INF/maven
META-INF/maven/com.google.guava
META-INF/maven/com.google.guava/guava
com
com/google
com/google/common
com/google/common/annotations
com/google/common/base
com/google/common/base/internal
com/google/common/cache
com/google/common/collect
com/google/common/escape
com/google/common/eventbus
com/google/common/graph
com/google/common/hash
com/google/common/html
com/google/common/io
com/google/common/math
com/google/common/net
com/google/common/primitives
com/google/common/reflect
com/google/common/util
com/google/common/util/concurrent
com/google/common/xml
com/google/thirdparty
com/google/thirdparty/publicsuffix
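
To make the mismatch concrete, here is a small sketch of how a pattern with a trailing `**` and a `@1` target maps a class name. It is an illustration only: `ShadeRenameSketch` and `rename` are hypothetical names, the code handles only a trailing `**`, and it is not the matching logic sbt-assembly itself runs.

```scala
// Illustrative only: mimics how a trailing "**" pattern and an "@1" target
// are applied to a fully qualified class name. NOT sbt-assembly's own code.
object ShadeRenameSketch {
  // Returns the renamed class name, or None when the pattern does not match.
  def rename(pattern: String, target: String, className: String): Option[String] = {
    require(pattern.endsWith("**"), "this sketch only handles a trailing **")
    val prefix = pattern.dropRight(2) // e.g. "com.google.common."
    if (className.startsWith(prefix) && className.length > prefix.length)
      Some(target.replace("@1", className.drop(prefix.length)))
    else
      None
  }

  def main(args: Array[String]): Unit = {
    val cls = "com.google.common.base.Preconditions"
    // The Maven group id is not a package, so nothing matches.
    println(rename("com.google.guava.**", "my_conf.@1", cls))  // None
    // The real package prefix matches and the class is relocated.
    println(rename("com.google.common.**", "my_conf.@1", cls)) // Some(my_conf.base.Preconditions)
  }
}
```

The first call returns nothing because `com.google.guava` is only the Maven group id. The second relocates the class because `com.google.common` is the package the classes are actually in.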

It should be enough to change your shade rule to this:

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "my_conf.@1")
    .inLibrary("com.google.guava" % "guava" % "20.0")
    .inLibrary("io.grpc" % "grpc-netty" % "1.1.2")
)

So you don't need to change userClassPathFirst.

Moreover, you can simplify your shade rule like this:

assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("com.google.common.**" -> "my_conf.@1").inAll
)
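
Since the org.apache.spark dependencies are marked "provided", they will not be included in your jar and will not be shaded. Spark will therefore keep using its own unshaded version of Guava on the cluster.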
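
The snippets in this thread use the older `key in scope` form. If the build runs on sbt 1.x with a recent sbt-assembly release (an assumption, since the thread does not state the versions), the simplified rule would typically be written with slash syntax, roughly like this:

```scala
// build.sbt sketch, assuming sbt 1.x and a recent sbt-assembly plugin.
// Relocates every class under com.google.common into my_conf,
// in all jars that end up in the assembly.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.google.common.**" -> "my_conf.@1").inAll
)
```

The rule itself is unchanged. Only the scoping notation differs from the form used in the question.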