Scala: cannot run Spark jobs on a standalone cluster


TL;DR:

How do I fix a
java.lang.IllegalStateException: Cannot find any build directories.
error when submitting a Spark job to a standalone cluster?


I packaged my Spark application in a Docker image using . This produces an image containing all the required JARs:

docker run --rm -it --entrypoint ls myimage:latest -l lib
total 199464
[...]
-r--r--r-- 1 demiourgos728 root  3354982 Oct  2  2016 org.apache.hadoop.hadoop-common-2.6.5.jar
[...]
-r--r--r-- 1 demiourgos728 root  8667550 Sep  8  2020 org.apache.spark.spark-core_2.12-2.4.7.jar
[...]
-r--r--r-- 1 demiourgos728 root  5276900 Sep 10  2019 org.scala-lang.scala-library-2.12.10.jar
[...]
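Since a scala-library mismatch is one of the usual suspects with standalone clusters, one thing I can do is log, from inside the submitted application, which Scala version the JVM actually loads. A minimal sketch (the object name `PrintScalaVersion` is mine, not part of the project):

```scala
// Minimal sketch: print the scala-library version actually loaded at runtime,
// to compare against the org.scala-lang.scala-library jar baked into the image.
object PrintScalaVersion {
  def main(args: Array[String]): Unit = {
    // versionNumberString is read from the library.properties file
    // bundled inside the scala-library jar on the classpath.
    println(s"scala-library version: ${scala.util.Properties.versionNumberString}")
  }
}
```

In this setup it should print the version of the single scala-library jar shown above (2.12.10), since that is the only one on the classpath.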
Then, I set up a standalone cluster with docker-compose:

version: '3'

services:
  spark-driver:
    image: myimage:latest
    ports:
      - "8080:8080"
    command: [
        "-main", "org.apache.spark.deploy.master.Master"
    ]

  spark-worker:
    image: myimage:latest
    ports:
      - "8081:8081"
    depends_on:
      - spark-driver
    command: [
        "-main", "org.apache.spark.deploy.worker.Worker",
        "spark-driver:7077",
        "--work-dir", "/tmp/spark_work"
    ]

  app:
    image: myimage:latest
    ports:
      - "4040:4040"
    environment:
      SPARK_HOME: "/opt/docker"
    depends_on:
      - spark-worker
    command: [
         "-main", "org.apache.spark.deploy.SparkSubmit",
         "--master", "spark://spark-driver:7077",
         "--class", "io.dummy.MyClass",
         "/opt/docker/lib/io.dummy.mypackage.jar"
    ]
Running the spark-driver and a few spark-workers works (the workers register with the driver, and so on). However, when starting my app, it keeps failing with this kind of error:

[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor added: app-20210511122036-0003/0 on worker-20210511114945-172.23.0.3-46727 (172.23.0.3:46727) with 8 core(s)
[o.a.s.s.c.StandaloneSchedulerBackend] Granted executor ID app-20210511122036-0003/0 on hostPort 172.23.0.3:46727 with 8 core(s), 1024.0 MB RAM
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor added: app-20210511122036-0003/1 on worker-20210511114945-172.23.0.4-40043 (172.23.0.4:40043) with 8 core(s)
[o.a.s.s.c.StandaloneSchedulerBackend] Granted executor ID app-20210511122036-0003/1 on hostPort 172.23.0.4:40043 with 8 core(s), 1024.0 MB RAM
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor updated: app-20210511122036-0003/0 is now RUNNING
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor updated: app-20210511122036-0003/1 is now RUNNING
[o.a.s.s.BlockManagerMaster] Registering BlockManager BlockManagerId(driver, e484e8deb590, 41285, None)
[o.a.s.d.c.StandaloneAppClient$ClientEndpoint] Executor updated: app-20210511122036-0003/0 is now FAILED (java.lang.IllegalStateException: Cannot find any build directories.)
[o.a.s.s.c.StandaloneSchedulerBackend] Executor app-20210511122036-0003/0 removed: java.lang.IllegalStateException: Cannot find any build directories.
[o.a.s.s.BlockManagerMasterEndpoint] Registering block manager e484e8deb590:41285 with 1917.3 MB RAM, BlockManagerId(driver, e484e8deb590, 41285, None)
[o.a.s.s.BlockManagerMaster] Removal of executor 0 requested
The relevant part seems to be:
java.lang.IllegalStateException: Cannot find any build directories.
Judging from various other SO posts, it seems related either to the SPARK_HOME environment variable or to a mismatched scala library version.
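For what it's worth, this exact message appears to come from Spark's launcher: when the SPARK_SCALA_VERSION environment variable is unset, org.apache.spark.launcher.AbstractCommandBuilder probes for launcher build directories under SPARK_HOME to infer the Scala version, and fails when none are found. A simplified Scala paraphrase of that logic (my reading of the Spark 2.4 source, not a verbatim copy):

```scala
import java.io.File

// Simplified paraphrase of the Scala-version probe in Spark 2.4's launcher
// (AbstractCommandBuilder.getScalaVersion): a flat lib/ directory of jars,
// like the one in the Docker image above, satisfies neither branch.
object LauncherSketch {
  def inferScalaVersion(sparkScalaVersionEnv: Option[String], sparkHome: String): String =
    sparkScalaVersionEnv.getOrElse {
      val scala212 = new File(sparkHome, "launcher/target/scala-2.12")
      val scala211 = new File(sparkHome, "launcher/target/scala-2.11")
      if (scala212.isDirectory) "2.12"
      else if (scala211.isDirectory) "2.11"
      else throw new IllegalStateException("Cannot find any build directories.")
    }
}
```

Under that reading, the executor launch on the worker would hit this probe because SPARK_SCALA_VERSION is not set and no SPARK_HOME value I tried points at a directory with that layout.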

However:

  • I tried different SPARK_HOME values (unset, /tmp, /opt/docker), and nothing changed
  • Regarding Scala: no Scala binary is installed in the image, but the Scala library jar is on the classpath

What is going on? How can this be fixed?