Java Apache Spark stops the JVM when the master is not available

In my application, a Java Spark context is created with a master URL that is not reachable (you can assume the master is down for maintenance). Creating the JavaSparkContext brings down the JVM that runs the Spark driver, with JVM exit code 50.

When I checked the logs, I found that SparkUncaughtExceptionHandler was calling System.exit. My program is supposed to run forever. How can I get around this?

I tried this scenario with Spark versions 1.4.1 and 1.6.0.

My code is given below:

package test.mains;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class CheckJavaSparkContext {

    /**
     * @param args the command line arguments
     */
    public static void main(String[] args) {

        SparkConf conf = new SparkConf();
        conf.setAppName("test");
        conf.setMaster("spark://sunshine:7077");

        try {
            new JavaSparkContext(conf);
        } catch (Throwable e) {
            System.out.println("Caught an exception : " + e.getMessage());
            //e.printStackTrace();
        }

        System.out.println("Waiting to complete...");
        while (true) {
        }
    }

}
Part of the output log:

16/03/04 18:02:24 INFO SparkDeploySchedulerBackend: Shutting down all executors
16/03/04 18:02:24 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
16/03/04 18:02:24 WARN AppClient$ClientEndpoint: Drop UnregisterApplication(null) because has not yet connected to master
16/03/04 18:02:24 ERROR SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[appclient-registration-retry-thread,5,main]
java.lang.InterruptedException
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1039)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:208)
    at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
    at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
    at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
    at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
    at scala.concurrent.Await$.result(package.scala:107)
    at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
    at org.apache.spark.deploy.client.AppClient.stop(AppClient.scala:290)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.org$apache$spark$scheduler$cluster$SparkDeploySchedulerBackend$$stop(SparkDeploySchedulerBackend.scala:198)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.stop(SparkDeploySchedulerBackend.scala:101)
    at org.apache.spark.scheduler.TaskSchedulerImpl.stop(TaskSchedulerImpl.scala:446)
    at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1582)
    at org.apache.spark.SparkContext$$anonfun$stop$7.apply$mcV$sp(SparkContext.scala:1731)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1229)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1730)
    at org.apache.spark.scheduler.cluster.SparkDeploySchedulerBackend.dead(SparkDeploySchedulerBackend.scala:127)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint.markDead(AppClient.scala:264)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2$$anonfun$run$1.apply$mcV$sp(AppClient.scala:134)
    at org.apache.spark.util.Utils$.tryOrExit(Utils.scala:1163)
    at org.apache.spark.deploy.client.AppClient$ClientEndpoint$$anon$2.run(AppClient.scala:129)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
16/03/04 18:02:24 INFO DiskBlockManager: Shutdown hook called
16/03/04 18:02:24 INFO ShutdownHookManager: Shutdown hook called
16/03/04 18:02:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-ea68a0fa-4f0d-4dbb-8407-cce90ef78a52
16/03/04 18:02:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-ea68a0fa-4f0d-4dbb-8407-cce90ef78a52/userFiles-db548748-a55c-4406-adcb-c09e63b118bd
Java Result: 50

If the master is down, the application itself will retry connecting to it a limited number of times. These retry parameters appear to be hardcoded and not configurable. If the application fails to connect, the only thing you can do is resubmit it once the master is up again.
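Along those lines, one practical workaround is to supervise the driver from outside and resubmit it when it dies. Below is a minimal sketch using org.apache.spark.launcher.SparkLauncher (available since Spark 1.4); the jar path, the 60-second backoff and the assumption that SPARK_HOME is set are placeholders of mine, not values from the question:

package test.mains;

import org.apache.spark.launcher.SparkLauncher;

public class DriverResubmitSupervisor {

    public static void main(String[] args) throws Exception {
        while (true) {
            // Launch the driver as a child process through spark-submit.
            // SPARK_HOME must be set (or use setSparkHome); the jar path and
            // main class below are placeholders.
            Process driver = new SparkLauncher()
                    .setAppResource("/path/to/check-java-spark-context.jar")
                    .setMainClass("test.mains.CheckJavaSparkContext")
                    .setMaster("spark://sunshine:7077")
                    .launch();

            int exitCode = driver.waitFor();
            if (exitCode == 0) {
                break; // the driver finished normally
            }

            // The driver died, e.g. with exit code 50 because the master was
            // unreachable; back off for a while and submit it again.
            System.out.println("Driver exited with code " + exitCode
                    + ", resubmitting in 60 seconds...");
            Thread.sleep(60_000);
        }
    }
}

The supervisor JVM never creates a SparkContext itself, so the uncaught-exception handler that kills the driver cannot take it down.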

This is why you should instead configure the cluster in a high-availability mode. Spark Standalone supports two different modes:

Single-Node Recovery with Local File System
Standby Masters with ZooKeeper

The second option (ZooKeeper-based standby masters) is the one suited to production and should help in the scenario described here.
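As a rough sketch of the ZooKeeper variant (host names and the ZooKeeper quorum below are placeholders, not values from the question), each master is started with the recovery properties set, and the driver lists every master so it can register with whichever one is currently alive:

package test.mains;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class CheckJavaSparkContextHA {

    // On every standalone master, spark-env.sh would carry something like
    // (the ZooKeeper quorum is a placeholder):
    //   SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER
    //                           -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181"

    public static void main(String[] args) {
        SparkConf conf = new SparkConf();
        conf.setAppName("test");
        // List every master; the driver registers with whichever one is alive.
        // "backup-master" is a placeholder host name.
        conf.setMaster("spark://sunshine:7077,backup-master:7077");

        JavaSparkContext sc = new JavaSparkContext(conf);
        System.out.println("Connected, default parallelism: " + sc.defaultParallelism());
        sc.stop();
    }
}

With this setup a standby master recovers the cluster state from ZooKeeper, so running applications and new submissions survive the loss of the active master.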

What is the reason for calling System.exit (exit code 50)? It is not acceptable to do something like that inside third-party API code. When I checked the code of SparkUncaughtExceptionHandler.scala, it does indeed contain that unfortunate JVM-killing call.
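For what it's worth, exit code 50 appears to correspond to UNCAUGHT_EXCEPTION in org.apache.spark.util.SparkExitCode, and the handler is installed as the default uncaught exception handler for Spark's internal threads. The real implementation is Scala; the Java sketch below only imitates the behaviour, to show why the try/catch around new JavaSparkContext(conf) in the main thread cannot prevent the exit when the fatal exception is raised on appclient-registration-retry-thread:

package test.mains;

// A Java imitation of what Spark's SparkUncaughtExceptionHandler does; this is
// NOT Spark's actual source (which is Scala), only a sketch of the behaviour.
public class UncaughtExceptionExitSketch {

    // In Spark this value is SparkExitCode.UNCAUGHT_EXCEPTION.
    private static final int UNCAUGHT_EXCEPTION = 50;

    public static void main(String[] args) throws InterruptedException {
        // Spark installs a default handler like this, so an exception thrown on
        // one of its internal threads (e.g. appclient-registration-retry-thread)
        // terminates the whole JVM, regardless of any try/catch in main().
        Thread.setDefaultUncaughtExceptionHandler((thread, throwable) -> {
            System.err.println("Uncaught exception in thread " + thread + ": " + throwable);
            System.exit(UNCAUGHT_EXCEPTION);
        });

        Thread background = new Thread(() -> {
            throw new RuntimeException("simulated failure on an internal thread");
        }, "appclient-registration-retry-thread");
        background.start();
        background.join();

        // Never reached: the handler has already halted the JVM with code 50.
        System.out.println("Still running");
    }
}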