Scala ApacheSpark——MlLib——协同过滤
我正在尝试使用MlLib进行协同过滤 在ApacheSpark1.0.0中运行Scala程序时,我遇到了以下错误Scala ApacheSpark——MlLib——协同过滤,scala,apache-spark,apache-spark-mllib,collaborative-filtering,Scala,Apache Spark,Apache Spark Mllib,Collaborative Filtering,我正在尝试使用MlLib进行协同过滤 在ApacheSpark1.0.0中运行Scala程序时,我遇到了以下错误 14/07/15 16:16:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/07/15 16:16:31 WARN LoadSnappy: Snappy n
14/07/15 16:16:31 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/07/15 16:16:31 WARN LoadSnappy: Snappy native library not loaded
14/07/15 16:16:31 INFO FileInputFormat: Total input paths to process : 1
14/07/15 16:16:38 WARN TaskSetManager: Lost TID 10 (task 80.0:0)
14/07/15 16:16:38 WARN TaskSetManager: Loss was due to java.lang.UnsatisfiedLinkError
java.lang.UnsatisfiedLinkError: org.jblas.NativeBlas.dposv(CII[DII[DII)I
at org.jblas.NativeBlas.dposv(Native Method)
at org.jblas.SimpleBlas.posv(SimpleBlas.java:369)
at org.jblas.Solve.solvePositive(Solve.java:68)
at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateBlock$2.apply(ALS.scala:522)
at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateBlock$2.apply(ALS.scala:509)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofInt.foreach(ArrayOps.scala:156)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofInt.map(ArrayOps.scala:156)
at org.apache.spark.mllib.recommendation.ALS.org$apache$spark$mllib$recommendation$ALS$$updateBlock(ALS.scala:509)
at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:445)
at org.apache.spark.mllib.recommendation.ALS$$anonfun$org$apache$spark$mllib$recommendation$ALS$$updateFeatures$2.apply(ALS.scala:444)
at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31)
at org.apache.spark.rdd.MappedValuesRDD$$anonfun$compute$1.apply(MappedValuesRDD.scala:31)
at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:156)
at org.apache.spark.rdd.CoGroupedRDD$$anonfun$compute$4.apply(CoGroupedRDD.scala:154)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.rdd.CoGroupedRDD.compute(CoGroupedRDD.scala:154)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.rdd.MappedValuesRDD.compute(MappedValuesRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.rdd.FlatMappedValuesRDD.compute(FlatMappedValuesRDD.scala:31)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:33)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:229)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:158)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.Task.run(Task.scala:51)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
14/07/15 16:16:38 ERROR TaskSchedulerImpl: Lost executor 0 on maroki.office.mkechinov.ru: Uncaught exception
14/07/15 16:16:38 WARN TaskSetManager: Lost TID 12 (task 80.0:0)
14/07/15 16:16:42 WARN TaskSetManager: Lost TID 18 (task 80.0:1)
14/07/15 16:16:42 WARN TaskSetManager: Loss was due to fetch failure from null
14/07/15 16:16:42 WARN TaskSetManager: Loss was due to fetch failure from null
14/07/15 16:16:43 WARN TaskSetManager: Lost TID 25 (task 80.1:0)
14/07/15 16:16:43 WARN TaskSetManager: Loss was due to java.lang.UnsatisfiedLinkError
如何解决此错误?明确提到MLLib使用本机库,这些库需要存在于节点上。(也就是说,它不附带spark安装)
MLlib使用jblas
线性代数库,它本身依赖于本机Fortran例程。如果节点上尚未安装运行时库,则可能需要安装该库。如果MLlib无法自动检测这些库,它将抛出链接错误
您必须确保所有节点上都存在libgfortran库
对于debian/ubuntu,请使用:
sudo apt get安装libgfortran3
供centos使用:
sudo yum安装gcc gfortran