Apache Spark: why is there no stack trace for "OutOfMemoryError: Java heap space" in the Worker logs?
I have a Spark job that fails with a GC / heap space error. When I check the terminal, I can see the stack trace:
Caused by: org.spark_project.guava.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: Java heap space
at org.spark_project.guava.cache.LocalCache$Segment.get(LocalCache.java:2261)
at org.spark_project.guava.cache.LocalCache.get(LocalCache.java:4000)
at org.spark_project.guava.cache.LocalCache.getOrLoad(LocalCache.java:4004)
at org.spark_project.guava.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4874)
at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.compile(CodeGenerator.scala:890)
at org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute(WholeStageCodegenExec.scala:357)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:114)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:135)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:132)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:113)
at org.apache.spark.sql.execution.exchange.ShuffleExchange.prepareShuffleDependency(ShuffleExchange.scala:85)
at org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:121)
at org.apache.spark.sql.execution.exchange.ShuffleExchange$$anonfun$doExecute$1.apply(ShuffleExchange.scala:112)
at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:52)
... 77 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.resize(HashMap.java:703)
at java.util.HashMap.putVal(HashMap.java:628)
at java.util.HashMap.putMapEntries(HashMap.java:514)
at java.util.HashMap.putAll(HashMap.java:784)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3073)
at org.codehaus.janino.UnitCompiler.access$4900(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$8.visitLocalVariableDeclarationStatement(UnitCompiler.java:2958)
at org.codehaus.janino.UnitCompiler$8.visitLocalVariableDeclarationStatement(UnitCompiler.java:2926)
at org.codehaus.janino.Java$LocalVariableDeclarationStatement.accept(Java.java:2974)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:3033)
at org.codehaus.janino.UnitCompiler.access$4400(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$8.visitSwitchStatement(UnitCompiler.java:2950)
at org.codehaus.janino.UnitCompiler$8.visitSwitchStatement(UnitCompiler.java:2926)
at org.codehaus.janino.Java$SwitchStatement.accept(Java.java:2866)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2982)
at org.codehaus.janino.UnitCompiler.access$3800(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2944)
at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2926)
at org.codehaus.janino.Java$Block.accept(Java.java:2471)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2999)
at org.codehaus.janino.UnitCompiler.access$4000(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$8.visitForStatement(UnitCompiler.java:2946)
at org.codehaus.janino.UnitCompiler$8.visitForStatement(UnitCompiler.java:2926)
at org.codehaus.janino.Java$ForStatement.accept(Java.java:2660)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2925)
at org.codehaus.janino.UnitCompiler.buildLocalVariableMap(UnitCompiler.java:2982)
at org.codehaus.janino.UnitCompiler.access$3800(UnitCompiler.java:206)
at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2944)
at org.codehaus.janino.UnitCompiler$8.visitBlock(UnitCompiler.java:2926)
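The problem is that the stack trace does not appear in any of the Worker logs (stdout and stderr) that I checked, either through the web UI or directly in the files on disk.
The application has one failed executor, and its stdout shows only: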
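17:12:17,008 ERROR [TransportResponseHandler] Still have 1 requests outstanding when connection from /<IP1>:35482 is closed
17:12:17,010 ERROR [CoarseGrainedExecutorBackend] Executor self-exiting due to : Driver <IP1>:35482 disassociated! Shutting down.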
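The stderr file is empty.
This is a big problem for me because I do not always see the whole log or stack trace in the console, so I am looking for a reliable, persistent way to capture it.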
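The org.codehaus.janino package is used for whole-stage Java code generation (see the line with org.apache.spark.sql.execution.WholeStageCodegenExec.doExecute in the stack trace). That step runs on the driver while the query is being prepared, before any RDD is ready for execution.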
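Because the failure occurs in the driver JVM, a persistent record has to be produced on the driver side. The following is a minimal sketch of one way to do that, not something taken from the original discussion: a small wrapper that appends the stack trace of any failure to a file and then rethrows it. The names `DriverFailureLog`, `logFailures`, and `logPath` are hypothetical, and after a genuine OutOfMemoryError the write itself may fail, so treat it as best effort. Redirecting the driver's console output to a file at launch is a simpler alternative.

```scala
import java.io.{FileWriter, PrintWriter}

// Hypothetical helper, not part of Spark.
object DriverFailureLog {
  /** Runs `body`; on any Throwable, appends its stack trace to `logPath` and rethrows. */
  def logFailures[A](logPath: String)(body: => A): A =
    try body
    catch {
      case t: Throwable =>
        // Best effort: with the heap exhausted, even this write can fail.
        val out = new PrintWriter(new FileWriter(logPath, true))
        try t.printStackTrace(out) finally out.close()
        throw t
    }
}
```

The driver code that triggers the query (for example the action that starts the job) would be passed as the `body` block, so that the trace, including its "Caused by" chain, ends up in the file as well as on the console.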
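"The problem is that the stack trace does not appear in any of the Worker logs (stdout and stderr) that I checked, either through the web UI or directly in the files on disk."
No Worker log should contain the stack trace, because nothing has been submitted to the executors (and hence the Workers) for execution yet. The job simply failed before the executors were given anything to run.
Is it possible to get a Java heap space error during the code generation phase?
What exactly do you expect? What else would tell you that the driver's JVM ran out of memory?
I run the driver with 4 GB of memory. I am just asking whether it is possible that generated code filled it up.
Ah, you want to see the code that failed to compile? Use q.queryExecution.debug.codegen, where q is the query you want to explore. You should not really be doing this, though, because it is very internal. You may have hit an issue in Spark. Which Spark version is it? Can you check with the latest, 2.2.1?
I ran it on 2.1.0, and changing the version will not be simple. I will check the debug option now.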
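For illustration, the debug call mentioned above could be tried as in the sketch below, pasted into spark-shell. The local session and the toy query `q` are assumptions standing in for the real failing query. The call prints the Java source generated for each whole-stage codegen subtree, which gives a rough idea of how large the generated code is.

```scala
import org.apache.spark.sql.SparkSession

// Local session purely for illustration; in spark-shell `spark` already exists
// and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("codegen-debug-sketch")
  .getOrCreate()

// Hypothetical stand-in for the query that fails in the real job.
val q = spark.range(10).selectExpr("id * 2 AS doubled").filter("doubled > 4")

// Internal API: prints the generated Java source for each codegen subtree.
q.queryExecution.debug.codegen()

// The physical plan shows which operators were fused by whole-stage codegen.
q.explain()
```

Since this is an internal API, its output format may differ between Spark versions.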