Java: How do I build a truly local Apache Spark "fat" jar? (JRE memory issue)

Tags: java, scala, apache-spark, sbt, sbt-assembly

Long story short: I have an application that uses Spark DataFrames and machine learning, with a ScalaFX front end. I want to create one huge "fat" jar so it can run on any machine with a JVM.

I'm familiar with the sbt-assembly plugin and have spent hours researching how to assemble the jar. Below is my build.sbt:

lazy val root = (project in file(".")).
  settings(
    scalaVersion := "2.11.8",
    mainClass in assembly := Some("me.projects.MyProject.Main"),
    assemblyJarName in assembly := "MyProject_2.0.jar",
    test in assembly := {}
  )

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" withSources() withJavadoc()
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" withSources() withJavadoc()
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.0.2" withSources() withJavadoc()
libraryDependencies += "joda-time" % "joda-time" % "2.9.4" withJavadoc()
libraryDependencies += "org.scalactic" %% "scalactic" % "3.0.1" % "provided"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % "test"
libraryDependencies += "org.scalafx" %% "scalafx" % "8.0.92-R10" withSources() withJavadoc()
libraryDependencies += "net.liftweb" %% "lift-json" % "2.6+" withSources() withJavadoc()

EclipseKeys.withSource := true
EclipseKeys.withJavadoc := true

// META-INF discarding
assemblyMergeStrategy in assembly := {
  case PathList("org","aopalliance", xs @ _*) => MergeStrategy.last
  case PathList("javax", "inject", xs @ _*) => MergeStrategy.last
  case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
  case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
  case PathList("org", "apache", xs @ _*) => MergeStrategy.last
  case PathList("com", "google", xs @ _*) => MergeStrategy.last
  case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
  case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
  case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
  case "about.html" => MergeStrategy.rename
  case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
  case "META-INF/mailcap" => MergeStrategy.last
  case "META-INF/mimetypes.default" => MergeStrategy.last
  case "plugin.properties" => MergeStrategy.last
  case "log4j.properties" => MergeStrategy.last
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
This runs fine on Linux machines where Spark is installed and configured. There were no problems until I assembled the jar with ScalaFX included and opened it on a Windows machine. The application, which also uses Spark, then gives the following:

ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200. Please increase the heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
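For context on where these numbers come from: in Spark 2.x the driver's "system memory" in local mode is taken from the JVM's max heap, and SparkContext refuses to start if it falls below 1.5 × the 300 MiB of reserved memory, i.e. the 471859200 bytes (450 MiB) in the message. A minimal sketch of that check against the current JVM:

```java
// Sketch: why SparkContext rejects a small default heap.
// Spark 2.x local mode reads Runtime.getRuntime().maxMemory() as "system memory";
// the floor is 1.5 x the 300 MiB reserved memory = 471859200 bytes (450 MiB).
public class HeapCheck {
    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();
        long sparkFloor = 3L * 300 * 1024 * 1024 / 2; // 471859200 bytes
        System.out.println("JVM max heap:  " + maxHeap + " bytes");
        System.out.println("Spark minimum: " + sparkFloor + " bytes");
        if (maxHeap < sparkFloor) {
            System.out.println("Heap too small: SparkContext init would fail.");
        }
    }
}
```

On a Windows JVM launched with a small default heap (here roughly 256 MB, per the 259522560 in the error), this check fails exactly as the exception describes.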
Things I have tried:

  • Including/excluding % "provided" on the Spark dependencies in build.sbt
  • Passing ever larger values to -Xms/-Xmx, both as runtime arguments and in the Windows machine's Java Runtime Environment settings
  • Setting different values for spark.executor/driver memory when creating the SparkConf (in the Scala code):

    .set("spark.executor.memory", "12g")
    .set("spark.executor.driver", "5g")
    .set("spark.driver.memory", "5g")
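As a side note on what those size strings mean: Spark accepts values like "512m" or "5g" and converts them to a byte count using binary units. A small hypothetical helper (not part of Spark's public API) mirroring that conversion:

```java
// Hypothetical helper, mirroring how size strings such as "12g" or "5g"
// in the SparkConf settings above translate to byte counts (binary units).
public class SizeString {
    /** Parses strings like "512m" or "5g" into bytes; plain numbers pass through. */
    public static long toBytes(String size) {
        String s = size.trim().toLowerCase();
        char unit = s.charAt(s.length() - 1);
        if (unit >= '0' && unit <= '9') {
            return Long.parseLong(s); // plain byte count
        }
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (unit) {
            case 'k': return value * 1024L;
            case 'm': return value * 1024L * 1024L;
            case 'g': return value * 1024L * 1024L * 1024L;
            default:  throw new IllegalArgumentException("Unknown unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(toBytes("5g")); // 5368709120
    }
}
```

So "5g" is 5368709120 bytes, comfortably above Spark's 471859200-byte floor; the problem is that in local mode these settings apply only after the driver JVM (with its already-fixed heap) has started.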

Otherwise the application works fine: when run from the Scala IDE, when run with spark-submit, and when the assembled jar is opened on Linux.

Please advise if you can. This is a small project that uses a GUI (ScalaFX) to run some machine learning operations (Spark) over some data, hence the dependencies above.


To repeat: I don't want to set up a cluster or anything like that. I just want the Spark functionality to be available by running the jar on any computer with a JRE. It's a small project for demonstration.

Try using
.set("spark.driver.memory", "5g")
when declaring the SparkConf. Of course, only if your machine has more than 5 GB of memory.

Turns out this was a fairly general JVM issue. Rather than just adding runtime arguments, I solved it by adding a new environment variable on the Windows system:

Name:
_JAVA_OPTIONS

Value:
-Xms512M -Xmx1024M
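To confirm the variable is actually being applied, note that the JVM prints "Picked up _JAVA_OPTIONS: ..." on startup when it honors it. A tiny check, run from a fresh terminal after setting the variable:

```java
// Quick check that _JAVA_OPTIONS reached the JVM: print the variable and the
// resulting max heap, which should roughly reflect the -Xmx value (e.g. ~1024 MiB
// for -Xmx1024M; the exact figure can be slightly lower due to survivor space).
public class VerifyOptions {
    public static void main(String[] args) {
        System.out.println("_JAVA_OPTIONS = " + System.getenv("_JAVA_OPTIONS"));
        long maxMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxMiB + " MiB");
    }
}
```

With -Xmx1024M in place, the heap clears Spark's 450 MiB minimum, which is why the error disappears.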

I tried this but still get the same error. The machine has 8 GB of RAM.

You mentioned you only have 8 GB of RAM; try setting executor memory to 5g and driver memory to 2g, leaving 1 GB for the OS.

Thanks for the input. After trying different executor and driver memory values (including your suggestion), the error persists. Apparently this is a JVM problem.