Apache Spark: MatrixFactorizationModel crashes when calling recommendProductsForUsers

I have an implicit MatrixFactorizationModel in Apache Spark 1.6.0 with more than 3 million users and 30,000 items. I now want to compute the top 10 recommended items for every user with code like the following:

import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

val model = MatrixFactorizationModel.load(sc, "/hdfs/path/to/model")
// Cache both factor RDDs so they stay in memory for the recommendation step.
model.userFeatures.cache()
model.productFeatures.cache()
val recommendations: RDD[(Int, Array[Rating])] = model.recommendProductsForUsers(10)
Unfortunately, this call crashes the computation with the following error:

WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Container marked as failed: container_xxx_xxxxxxxxx_xxxx_xx_xxxxx on host: xxxxx.xxxx.xxx. Exit status: 1. Diagnostics: Exception from container-launch.
Container id: container_xxx_xxxxx_xxxx_xx_xxxxxxx
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:576)
        at org.apache.hadoop.util.Shell.run(Shell.java:487)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:753)
        at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:212)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:303)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)


Container exited with a non-zero exit code 1
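
Note that the exit status 1 reported by container-launch is generic; the underlying exception is usually only visible in the stderr of the failed container, which can be pulled from YARN after the application has stopped (the application id placeholder is illustrative):

yarn logs -applicationId <applicationId>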
What is causing this, and how can I compute the recommendations?

Spark runs on a 54-node cluster, and I start the REPL with the following command:

spark-shell  \
--master yarn \
--driver-memory 16g \
--executor-memory 16G \
--num-executors 32 \
--executor-cores 8
Both the user and the item factors are in cached RDDs with 504 partitions each.
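
A first thing worth checking is the YARN memory overhead: spark.yarn.executor.memoryOverhead defaults to only 10% of the executor memory, and raising it (for example --conf spark.yarn.executor.memoryOverhead=4096; the value is illustrative) is a common adjustment when YARN containers die during memory-heavy stages, although exit code 1 alone does not prove a memory problem. If recommendProductsForUsers still fails, a possible fallback is sketched below, reusing the model and sc from above. It assumes the roughly 30,000 item factor vectors fit in a broadcast variable; names such as itemFactorsBc, manualRecs and numRecs are made up for the example, and this is not the MLlib internal algorithm:

import org.apache.spark.mllib.recommendation.Rating
import org.apache.spark.rdd.RDD

// Broadcast the (small) item factor matrix and score each user locally.
val numRecs = 10
val itemFactors = model.productFeatures.collect()   // ~30k (itemId, factor vector) pairs
val itemFactorsBc = sc.broadcast(itemFactors)

val manualRecs: RDD[(Int, Array[Rating])] =
  model.userFeatures.mapPartitions { users =>
    val items = itemFactorsBc.value
    users.map { case (userId, userVec) =>
      // Score every item for this user via a dot product of the latent factors,
      // then keep the ten highest-scoring items.
      val scored = items.map { case (itemId, itemVec) =>
        var dot = 0.0
        var i = 0
        while (i < userVec.length) { dot += userVec(i) * itemVec(i); i += 1 }
        Rating(userId, itemId, dot)
      }
      (userId, scored.sortBy(-_.rating).take(numRecs))
    }
  }

This trades the shuffle-heavy join inside recommendProductsForUsers for one broadcast of the item factors per executor, which for ~30k vectors is typically only a few tens of megabytes.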