Apache Spark Streaming application gives OOM after running for 24 hours

Tags: apache-spark, garbage-collection, spark-streaming, spark-dataframe

I am using Spark 1.5.0 and am developing a Spark Streaming application. The application reads files from HDFS, converts the RDDs to DataFrames, and runs multiple queries against each DataFrame.
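
For context, the job has roughly this shape (a minimal sketch against the Spark 1.5 APIs; the input path, batch interval, schema, and query are assumptions for illustration, not the actual code):

    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object StreamingQueries {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("hdfs-streaming-queries")
        val ssc = new StreamingContext(conf, Seconds(60))
        val sqlContext = new SQLContext(ssc.sparkContext)
        import sqlContext.implicits._

        // Watch an HDFS directory for newly arriving files (path is a placeholder).
        val lines = ssc.textFileStream("hdfs:///data/incoming")

        lines.foreachRDD { rdd =>
          // Convert each micro-batch RDD to a DataFrame and run queries against it.
          val df = rdd.map(_.split(",")).map(a => (a(0), a(1))).toDF("key", "value")
          df.registerTempTable("events")
          sqlContext.sql("SELECT key, COUNT(*) AS n FROM events GROUP BY key").collect()
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }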

The application runs perfectly for about 24 hours and then crashes. The application master / driver logs show:

Exception in thread "dag-scheduler-event-loop" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.getDeclaredMethod(Class.java:2128)
    at java.io.ObjectStreamClass.getInheritableMethod(ObjectStreamClass.java:1442)
    at java.io.ObjectStreamClass.access$2200(ObjectStreamClass.java:72)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:508)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.io.ObjectStreamClass.<init>(ObjectStreamClass.java:472)
    at java.io.ObjectStreamClass.lookup(ObjectStreamClass.java:369)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1134)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348)
    at scala.collection.immutable.$colon$colon.writeObject(List.scala:379)
    at sun.reflect.GeneratedMethodAccessor1511.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:1028)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1496)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548)
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
Exception in thread "JobGenerator" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.zip.ZipCoder.getBytes(ZipCoder.java:80)
    at java.util.zip.ZipFile.getEntry(ZipFile.java:310)
    at java.util.jar.JarFile.getEntry(JarFile.java:240)
    at sun.net.www.protocol.jar.URLJarFile.getEntry(URLJarFile.java:128)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:132)
    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:150)
    at java.net.URLClassLoader.getResourceAsStream(URLClassLoader.java:238)
    at java.lang.Class.getResourceAsStream(Class.java:2223)
    at org.apache.spark.util.ClosureCleaner$.getClassReader(ClosureCleaner.scala:38)
    at org.apache.spark.util.ClosureCleaner$.getInnerClosureClasses(ClosureCleaner.scala:81)
    at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:187)
    at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
    at org.apache.spark.SparkContext.clean(SparkContext.scala:2032)
    at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:314)
    at org.apache.spark.rdd.RDD$$anonfun$map$1.apply(RDD.scala:313)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
    at org.apache.spark.rdd.RDD.map(RDD.scala:313)
    at org.apache.spark.streaming.dstream.MappedDStream$$anonfun$compute$1.apply(MappedDStream.scala:35)
    at org.apache.spark.streaming.dstream.MappedDStream$$anonfun$compute$1.apply(MappedDStream.scala:35)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:350)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:350)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:57)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:349)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:349)
    at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:399)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:344)
    at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:342)
    at scala.Option.orElse(Option.scala:257)
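
(Side note: a driver heap dump like the one mentioned below can be captured automatically on OOM by passing standard HotSpot flags to the driver via spark-submit; the class name, jar, and dump path here are placeholders:)

    spark-submit \
      --driver-java-options "-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp/driver.hprof -XX:+PrintGCDetails -XX:+PrintGCTimeStamps" \
      --class com.example.StreamingQueries \
      my-streaming-app.jar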

I collected a driver heap dump, and it says the possible memory leak comes from
org.apache.spark.sql.execution.ui.SQLListener

Also, in my application master UI I can see thousands of SQL tabs, e.g. SQL1, SQL2, ... SQL2000, and the number of these tabs keeps increasing.

Does anyone know why these SQL tabs keep increasing, and can anyone suggest what to do about the GC exception? Thanks.
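
In case it helps anyone hitting the same wall: the ever-growing SQL tabs suggest the driver is retaining one UI entry per micro-batch query. One hedged mitigation (assuming the Spark 1.5.x configuration keys below; the values are illustrative, not tuned) is to cap how much listener/UI state the driver keeps:

    import org.apache.spark.SparkConf

    // Sketch: cap the UI/listener state retained on the driver.
    val conf = new SparkConf()
      .setAppName("hdfs-streaming-queries")
      .set("spark.sql.ui.retainedExecutions", "50") // entries kept by SQLListener for the SQL tab
      .set("spark.ui.retainedJobs", "100")
      .set("spark.ui.retainedStages", "100")
    // As a last resort, spark.ui.enabled=false turns the web UI off entirely.

If I remember correctly, later 1.5.x patch releases also fixed a leak in SQLListener itself (SPARK-11126), so upgrading from 1.5.0 may be worth trying as well.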

There are some m