Apache Spark job crashes with ExitCodeException exitCode=15


I am running a long Spark job which crashes with the following error:

Application application_1456200816465_347125 failed 2 times due to AM Container for appattempt_1456200816465_347125_000002 exited with exitCode: 15
For more detailed output, check application tracking page:http://foo.com:8088/proxy/application_1456200816465_347125/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e24_1456200816465_347125_02_000001
Exit code: 15
Stack trace: ExitCodeException exitCode=15:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
at org.apache.hadoop.util.Shell.run(Shell.java:455)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 15
Failing this attempt. Failing the application.
I clicked the link provided in the error message above, which shows me:

java.io.IOException: Target log file already exists (hdfs://nameservice1/user/spark/applicationHistory/application_1456200816465_347125)
    at org.apache.spark.scheduler.EventLoggingListener.stop(EventLoggingListener.scala:201)
    at org.apache.spark.SparkContext$$anonfun$stop$5.apply(SparkContext.scala:1394)
    at org.apache.spark.SparkContext$$anonfun$stop$5.apply(SparkContext.scala:1394)
    at scala.Option.foreach(Option.scala:236)
    at org.apache.spark.SparkContext.stop(SparkContext.scala:1394)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:107)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
If I restart the job, it works fine for an hour or so and then fails again with this error. Note that
hdfs://nameservice1/user/spark/applicationHistory/application_1456200816465_347125
is something system generated. This folder has nothing to do with my application.
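Looking at the path, the event log file name appears to be derived from the application ID, so the retried AM attempt presumably collides with the log that the failed first attempt already wrote. A minimal sketch of a workaround, assuming the collision itself is the problem rather than just a symptom, and assuming this Spark version honors spark.eventLog.overwrite:

import org.apache.spark.SparkConf

// Assumption: letting the retried attempt overwrite the stale event log
// sidesteps the IOException; it does not explain why attempt 1 died in the first place.
val conf = new SparkConf()
  .setAppName("Foo")
  .set("spark.eventLog.overwrite", "true")

The same setting could presumably also be passed to spark-submit as --conf spark.eventLog.overwrite=true without code changes.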

I searched around on the internet, and many people hit this error because they had hard-coded the master to local in their code. This is how I initialize my Spark context:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

val conf = new SparkConf().setAppName("Foo") // master intentionally left unset; spark-submit supplies it
val context = new SparkContext(conf)
// Recurse into input subdirectories when reading
context.hadoopConfiguration.set("mapreduce.input.fileinputformat.input.dir.recursive", "true")
val sc = new SQLContext(context)
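For contrast, the anti-pattern people reported, which I am not using, hard-codes the master inside the application (badConf below is a hypothetical name, for illustration only):

// Hypothetical anti-pattern, NOT my code: a hard-coded local master
// would conflict with --master yarn-cluster given at submit time.
val badConf = new SparkConf().setAppName("Foo").setMaster("local[*]")

Since my conf leaves the master unset, the --master yarn-cluster flag from spark-submit should be the one in effect.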
I submit my Spark job like this:

sudo -u web nohup spark-submit --class com.abhi.Foo --master yarn-cluster \
Foo-assembly-1.0.jar "2015-03-18" "2015-03-30" > fn_output.txt 2> fn_error.txt &