Java heap space error when running the SVMWithSGD algorithm in Apache Spark MLlib


My fnl2 dataset has the following format:

scala> fnl2.first()
res4: org.apache.spark.mllib.regression.LabeledPoint = (0.0,(612515,[28693,86703,94568,162663,267733,292870,327313,347868,362660,396595,415817,436773,443713,470149,485282,486556,489594,496185,541453,570126,571088],[1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0]))

scala> fnl2.count()
res5: Long = 775946
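As a back-of-envelope check on the sizes involved, the following sketch uses the numbers from the output above (775,946 points, a 612,515-dimensional feature space, ~21 non-zeros per point). JVM object overhead, which can multiply these figures several times, is ignored, and the per-element byte costs are rough assumptions:

```python
# Rough size estimates for the dataset shown above (back-of-envelope only).
num_points = 775_946      # from fnl2.count()
num_features = 612_515    # sparse vector dimension in the LabeledPoint
nnz_per_point = 21        # non-zeros in the sampled first point

# Sparse storage: ~4 bytes per int index + 8 bytes per double value.
sparse_bytes = num_points * nnz_per_point * (4 + 8)

# One dense vector of this dimension (SGD works with dense
# weight/gradient vectors of length num_features):
dense_vec_bytes = num_features * 8

print(f"sparse data ~ {sparse_bytes / 2**20:.0f} MiB")       # ~186 MiB
print(f"one dense vector ~ {dense_vec_bytes / 2**20:.1f} MiB")  # ~4.7 MiB
```

So the raw numbers alone are well under the 8G per node; the question is where the overhead comes from.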
Then I try to build an SVMWithSGD model:

import org.apache.spark.mllib.classification.SVMWithSGD
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

val splits = fnl2.randomSplit(Array(0.6, 0.4), seed = 11L)
val training = splits(0).cache()
val test = splits(1)

val numIterations = 100
val model = SVMWithSGD.train(training, numIterations)
But I get the following Java heap space error, after which the Spark context shuts down unexpectedly:

15/08/10 09:15:41 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
15/08/10 09:15:41 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
15/08/10 09:23:50 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-30] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space
    at com.google.protobuf_spark.ByteString.toByteArray(ByteString.java:213)
    at akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:24)
    at akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:55)
    at akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:55)
    at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:73)
    at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:764)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/08/10 09:23:56 ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(eastspark1,57211) not found
org.apache.spark.SparkException: Job cancelled because SparkContext was shut down
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:694)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:693)
    at scala.collection.mutable.HashSet.foreach(HashSet.scala:79)
    at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:693)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessActor.postStop(DAGScheduler.scala:1399)
    at akka.actor.dungeon.FaultHandling$class.akka$actor$dungeon$FaultHandling$$finishTerminate(FaultHandling.scala:201)
    at akka.actor.dungeon.FaultHandling$class.terminate(FaultHandling.scala:163)
    at akka.actor.ActorCell.terminate(ActorCell.scala:338)
    at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:431)
    at akka.actor.ActorCell.systemInvoke(ActorCell.scala:447)
    at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:262)
    at akka.dispatch.Mailbox.run(Mailbox.scala:218)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


scala> 15/08/10 09:23:56 ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(10.2.0.14,37151) not found
15/08/10 09:23:56 ERROR SendingConnection: Exception while reading SendingConnection to ConnectionManagerId(10.2.0.16,54187)
java.nio.channels.ClosedChannelException
    at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:257)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:300)
    at org.apache.spark.network.SendingConnection.read(Connection.scala:390)
    at org.apache.spark.network.ConnectionManager$$anon$7.run(ConnectionManager.scala:199)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
The Spark context runs on 12 cores, with 8G of memory on each node.

Any ideas?

EDIT

This is the error I get after increasing the driver's memory to 5G with export SPARK_DRIVER_MEMORY="5000M":

scala> val model = SVMWithSGD.train(training, numIterations)
15/08/10 11:33:07 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
15/08/10 11:33:07 WARN BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
Exception in thread "qtp950243028-158" java.lang.OutOfMemoryError: GC overhead limit exceeded
15/08/10 11:46:26 ERROR ActorSystemImpl: Uncaught fatal error from thread [sparkDriver-akka.actor.default-dispatcher-32] shutting down ActorSystem [sparkDriver]
java.lang.OutOfMemoryError: Java heap space
    at com.google.protobuf_spark.ByteString.copyFrom(ByteString.java:90)
    at com.google.protobuf_spark.CodedInputStream.readBytes(CodedInputStream.java:289)
    at akka.remote.WireFormats$SerializedMessage$Builder.mergeFrom(WireFormats.java:2700)
    at akka.remote.WireFormats$SerializedMessage$Builder.mergeFrom(WireFormats.java:2546)
    at com.google.protobuf_spark.CodedInputStream.readMessage(CodedInputStream.java:275)
    at akka.remote.WireFormats$RemoteEnvelope$Builder.mergeFrom(WireFormats.java:1165)
    at akka.remote.WireFormats$RemoteEnvelope$Builder.mergeFrom(WireFormats.java:949)
    at com.google.protobuf_spark.CodedInputStream.readMessage(CodedInputStream.java:275)
    at akka.remote.WireFormats$AckAndEnvelopeContainer$Builder.mergeFrom(WireFormats.java:479)
    at akka.remote.WireFormats$AckAndEnvelopeContainer$Builder.mergeFrom(WireFormats.java:300)
    at com.google.protobuf_spark.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:300)
    at com.google.protobuf_spark.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf_spark.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:162)
    at com.google.protobuf_spark.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:716)
    at com.google.protobuf_spark.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:238)
    at com.google.protobuf_spark.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:153)
    at com.google.protobuf_spark.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:709)
    at akka.remote.WireFormats$AckAndEnvelopeContainer.parseFrom(WireFormats.java:234)
    at akka.remote.transport.AkkaPduProtobufCodec$.decodeMessage(AkkaPduCodec.scala:181)
    at akka.remote.EndpointReader.akka$remote$EndpointReader$$tryDecodeMessageAndAck(Endpoint.scala:821)
    at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:755)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
    at akka.actor.ActorCell.invoke(ActorCell.scala:456)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
    at akka.dispatch.Mailbox.run(Mailbox.scala:219)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/08/10 11:46:45 WARN AbstractNioWorker: Unexpected exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
    at org.jboss.netty.buffer.HeapChannelBuffer.<init>(HeapChannelBuffer.java:42)
    at org.jboss.netty.buffer.BigEndianHeapChannelBuffer.<init>(BigEndianHeapChannelBuffer.java:34)
    at org.jboss.netty.buffer.ChannelBuffers.buffer(ChannelBuffers.java:134)
    at org.jboss.netty.buffer.HeapChannelBufferFactory.getBuffer(HeapChannelBufferFactory.java:69)
    at org.jboss.netty.buffer.AbstractChannelBufferFactory.getBuffer(AbstractChannelBufferFactory.java:48)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:75)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.processSelectedKeys(AbstractNioWorker.java:472)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:333)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:35)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)


How much memory have you allocated to the JVM? I suggest reading this. The default 512M is probably not enough for you. Rerun it with 5G of memory: export SPARK_DRIVER_MEMORY="5000M"

I'm still facing a problem, but it is now slightly different. Please see my edit.

The problem may be that your data is too large to fit in memory. You could try computing the SVM model on a smaller sample.

Weird, the dataset doesn't look that big, but it is throwing "GC overhead limit exceeded". Definitely reduce the size of fnl2 and see if it works. It could also be caused by the way you are calling the SVM.

It actually worked. Instead of using export SPARK_DRIVER_MEMORY="5000M", I increased the driver's memory at the Spark configuration level.
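For reference, raising the driver's memory at the Spark configuration level (the fix that ultimately worked, per the comments) can be sketched as follows. Flag and property names come from standard Spark configuration; the values are examples, not recommendations:

```shell
# One-off: raise driver/executor heap at launch time.
spark-shell --driver-memory 5g --executor-memory 6g

# Persistent: the same settings in conf/spark-defaults.conf:
#   spark.driver.memory    5g
#   spark.executor.memory  6g
#
# Note: spark.driver.memory must be in effect before the driver JVM
# starts, so setting it from code via SparkConf has no effect on an
# already-running shell; use the flag, the conf file, or the
# SPARK_DRIVER_MEMORY environment variable instead.
```
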