Exception in task 0.0 in stage 13.0 (TID 13): java.lang.OutOfMemoryError: Java heap space

We are experiencing some problems when using the "mahout spark-rowsimilarity" job. With an input matrix of 100k rows and 100 items, the process throws "Exception in task 0.0 in stage 13.0 (TID 13): java.lang.OutOfMemoryError: Java heap space". We have tried increasing the Java heap, the Mahout heap (MAHOUT_HEAPSIZE) and spark.driver.memory, without success.

Environment versions:

Mahout: 0.11.1
Spark: 1.6.0

Mahout command line:

/opt/mahout/bin/mahout spark-rowsimilarity -i 50k_rows__50items.dat -o test_output.tmp --maxObservations 500 --maxSimilaritiesPerRow 100 --omitStrength --master local --sparkExecutorMem 8g
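
Note that with --master local the executor runs inside the driver JVM, so the heap that overflows here is the driver's, and --sparkExecutorMem on its own may not raise it. A minimal sketch of how we set spark.driver.memory (the value is illustrative; this property only takes effect if it is set before the driver JVM starts, e.g. in spark-defaults.conf or via spark-submit, not from inside an already running JVM):

# $SPARK_HOME/conf/spark-defaults.conf -- value is illustrative
spark.driver.memory 6g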
The process runs on a machine with the following specs:

RAM: 8 GB
CPU: 8 cores
.profile file:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export HADOOP_HOME=/opt/hadoop-2.6.0
export SPARK_HOME=/opt/spark
export MAHOUT_HOME=/opt/mahout
export MAHOUT_HEAPSIZE=8192
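
Since spark.driver.memory cannot resize a JVM that is already running, we also tried raising the launcher's heap directly. A sketch, assuming this version of bin/mahout honors MAHOUT_OPTS for extra JVM flags (the value is illustrative):

export MAHOUT_OPTS="-Xmx6g"   # extra JVM flags for bin/mahout (assumed); value illustrative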
Exception thrown:

16/01/22 11:45:06 ERROR Executor: Exception in task 0.0 in stage 13.0 (TID 13)
java.lang.OutOfMemoryError: Java heap space
        at org.apache.mahout.math.DenseMatrix.<init>(DenseMatrix.java:66)
        at org.apache.mahout.sparkbindings.drm.package$$anonfun$blockify$1.apply(package.scala:70)
        at org.apache.mahout.sparkbindings.drm.package$$anonfun$blockify$1.apply(package.scala:59)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:89)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/01/22 11:45:06 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@12498227,BlockManagerId(driver, localhost, 42107))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:448)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:468)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
16/01/22 11:45:06 WARN NettyRpcEndpointRef: Error sending message [message = Heartbeat(driver,[Lscala.Tuple2;@12498227,BlockManagerId(driver, localhost, 42107))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [120 seconds]. This timeout is controlled by spark.rpc.askTimeout
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:76)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:101)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:77)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:448)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:468)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1741)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:468)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [120 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:107)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
        ...