
Python: Spark MLlib error during training

Tags: python, apache-spark, pyspark, apache-spark-mllib, recommendation-engine

Need help! I am using Spark MLlib's ALS.trainImplicit. When I run a grid search, the code works fine for most parameter combinations, but under certain parameters it stops with an error message like this:

......
Rank 40, reg 1.0, alpha 2.0, the RMSE = 29.7147495287 
Rank 40, reg 1.0, alpha 5.0, the RMSE = 30.1937843479
Traceback (most recent call last):
  File "/home/ubuntu/test/als.py", line 270, in <module>
  File "/home/ubuntu/test/als.py", line 125, in __init__
    self.models_grid_search()
  File "/home/ubuntu/test/als.py", line 195, in models_grid_search
    model = ALS.trainImplicit(self.trainData, rank, iterations=self.iterations,
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/recommendation.py", line 201, in trainImplicit
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/common.py", line 130, in callMLlibFunc
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/common.py", line 123, in callJavaFunc
  File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o1336.trainImplicitALSModel.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 17 in stage 22096.0 failed 4 times,
  most recent failure: Lost task 17.3 in stage 22096.0 (TID 25114, 172.31.11.21): java.lang.AssertionError:
  assertion failed: lapack.dppsv returned 23.
     at scala.Predef$.assert(Predef.scala:179)
     at org.apache.spark.ml.recommendation.ALS$CholeskySolver.solve(ALS.scala:393)
     at org.apache.spark.ml.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1170)
     at org.apache.spark.ml.recommendation.ALS$$anonfun$org$apache$spark$ml$recommendation$ALS$$computeFactors$1.apply(ALS.scala:1131)
    .....
Where should I put

sc.setCheckpointDir('checkpoint/')
ALS.checkpointInterval = 2

Should they go right after creating sc, before training the model, or after training? My code is at the end of this post (the second error message, also below, appears when sc.setCheckpointDir('checkpoint/') is used).


Many thanks in advance for any solutions and comments. My Spark version is v1.5.2 and, as shown above, I am using PySpark.

The first problem is the result of an ill-conditioned matrix. There is not much you can really do about it (you could try adding some random noise, but that is a rather ugly trick). As for the rest: the way you define the class looks suspicious (you could check, for example), but without a reproducible example this is just guessing.

@zero323: Thank you very much. Could you point me to any material (e.g. a site where this is discussed) on adding random noise? Is the noise added to the data? When I increase every rating in the ratings RDD (self.trainData in my code) by one, i.e. (user, product, views) -> (user, product, views + 1), the problem disappears. I tried many things, such as reducing the number of iterations and changing the seed, but nothing worked. As you said, perhaps the structure of the data is causing the ill-conditioned matrix problem?

Not really, but the idea is almost the same as adding 1, just using a distribution that should have minimal impact, e.g. ~N(0, 0.005).

I think this is a good idea. Do you think it will introduce some disturbance into the data analysis, or have some unpredictable effect on model accuracy?

Some, for sure, but we are working with approximations anyway, and noise close to the numerical precision limit is usually enough.
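For illustration, a minimal sketch of the noise trick discussed above (the add_noise helper and the sigma value are illustrative choices, not from the original post):

import random

def add_noise(ratings_rdd, sigma=0.005):
    # ratings_rdd: RDD of (user, product, views) tuples.
    # Perturb each implicit rating with a small Gaussian term (~N(0, sigma))
    # so the least-squares systems ALS solves are less likely to be
    # numerically singular, while leaving the data essentially unchanged.
    return ratings_rdd.map(
        lambda r: (r[0], r[1], r[2] + random.gauss(0.0, sigma)))

With the class below, this could be applied once before the grid search, e.g. self.trainData = add_noise(self.trainData).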
The later error message (when sc.setCheckpointDir('checkpoint/') is set):

Traceback (most recent call last):
  File "/home/ubuntu/test/als_implicit.py", line 275, in <module>
    engine = ImplicitCF(sc, rank=8, seed=5L, iterations=10, reg_parameter=0.06)
  File "/home/ubuntu/test/als_implicit.py", line 129, in __init__
    self.models_grid_search()
  File "/home/ubuntu/test/als_implicit.py", line 200, in models_grid_search
    lambda_=reg, blocks=-1, alpha=alphas, nonnegative=False, seed=self.seed)
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/recommendation.py", line 201, in trainImplicit
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/common.py", line 130, in callMLlibFunc
  File "/usr/local/spark/python/lib/pyspark.zip/pyspark/mllib/common.py", line 123, in callJavaFunc
  File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 538, in __call__
  File "/usr/local/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o116.trainImplicitALSModel.
: org.apache.spark.SparkException: Checkpoint RDD ReliableCheckpointRDD[232]
  at aggregate at ALS.scala:1182(0) has different number of partitions from original RDD itemFactors-10
  MapPartitionsRDD[230] at mapValues at ALS.scala:1131(18)
      at org.apache.spark.rdd.ReliableRDDCheckpointData.doCheckpoint(ReliableRDDCheckpointData.scala:73)
      at org.apache.spark.rdd.RDDCheckpointData.checkpoint(RDDCheckpointData.scala:74)
      at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply$mcV$sp(RDD.scala:1655)
      at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1652)
      ....
My code:

from __future__ import print_function
import sys
from pyspark.mllib.recommendation import ALS
from pyspark.mllib.recommendation import Rating
from pyspark import SparkContext, SparkConf

class ImplicitCF(object):
    def __init__(self, sc, rank, seed, iterations, reg_parameter):
        text = sc.textFile(sys.argv[1], 1)
        sc.setCheckpointDir('checkpoint/')
        self.sc = sc

        self.rank = rank
        self.seed = seed
        self.iterations = iterations
        self.reg = reg_parameter

        self.models_grid_search()  # bound method; the extra self argument in the original was a typo

    # .......

    def models_grid_search(self):
        for reg in [1.0, 2.0, 5.0]:
            for alphas in [0.1, 0.5, 1.0, 2.0, 5.0]:
                model = ALS.trainImplicit(self.trainData, rank=self.rank,
                                          iterations=self.iterations, lambda_=reg,
                                          blocks=-1, alpha=alphas,
                                          nonnegative=False, seed=self.seed)
                #self.sc.setCheckpointDir('checkpoint/')
                #ALS.checkpointInterval = 2

    # ....

if __name__ == "__main__":
    sc = SparkContext(appName="implicit_train_test")
    engine = ImplicitCF(sc, rank=8, seed=5L, iterations=10, reg_parameter=0.06)
    sc.stop()
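On the placement question: a common pattern is to call sc.setCheckpointDir once, immediately after the SparkContext is created and before any training call, since Spark needs to know the directory before the first checkpoint is written during trainImplicit. A minimal sketch (whether pyspark.mllib's ALS actually reads the checkpointInterval class attribute in v1.5.2 is an assumption carried over from the question, not a verified API):

if __name__ == "__main__":
    sc = SparkContext(appName="implicit_train_test")
    # Set once, right after the context is created and before any model
    # is trained; a relative path resolves against the default filesystem.
    sc.setCheckpointDir('checkpoint/')
    # Carried over from the question; treat as an assumption, since it is
    # not a documented pyspark.mllib setting in v1.5.2.
    ALS.checkpointInterval = 2
    engine = ImplicitCF(sc, rank=8, seed=5L, iterations=10, reg_parameter=0.06)
    sc.stop()

With this, the setCheckpointDir call inside __init__ becomes redundant and can be dropped.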