Apache Spark: Why can't I set epsilon=1e-4 on the Spark KMeans algorithm?
Tags: apache-spark, cluster-analysis, k-means, apache-spark-mllib

I want to train a K-means model on Spark by setting epsilon=1e-4 instead of setting numIterations. In the spark-shell, I entered:

val model = KMeans.train(trainRDD, numClusters=8, runs=30, initializationMode="k-means||",epsilon=1e-4)
But it fails with the following error message:

scala> val model = KMeans.train(trainRDD, numClusters=8, runs=30, initializationMode="k-means||",epsilon=1e-4)
<console>:48: error: overloaded method value train with alternatives:
  (data: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector],k: Int,maxIterations: Int,runs: Int)org.apache.spark.mllib.clustering.KMeansModel <and>
  (data: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector],k: Int,maxIterations: Int)org.apache.spark.mllib.clustering.KMeansModel <and>
  (data: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector],k: Int,maxIterations: Int,runs: Int,initializationMode: String)org.apache.spark.mllib.clustering.KMeansModel <and>
  (data: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector],k: Int,maxIterations: Int,runs: Int,initializationMode: String,seed: Long)org.apache.spark.mllib.clustering.KMeansModel
 cannot be applied to (org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector], numClusters: Int, runs: Int, initializationMode: String, epsilon: Double)
       val model = KMeans.train(trainRDD, numClusters=8, runs=30, initializationMode="k-means||",epsilon=1e-4)
                          ^

What should I do?

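No such train method is defined: none of the overloads listed in the error message accepts an epsilon parameter, and none uses the parameter name numClusters (the parameter is called k).

Use the actual constructor and set the parameters as needed.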

See the documentation for the KMeans class.

Then use setEpsilon to set the early-termination threshold.
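
The advice above can be sketched roughly as follows, assuming a spark-shell session in which trainRDD from the question is an RDD[Vector]. The helper name trainWithEpsilon and the iteration cap of 1000 are illustrative choices, not part of the original answer. The runs=30 setting from the question is left out, since setRuns is, as far as I know, deprecated or absent in later Spark releases.

```scala
import org.apache.spark.mllib.clustering.{KMeans, KMeansModel}
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Hypothetical helper: configure KMeans through its setters instead of KMeans.train.
def trainWithEpsilon(data: RDD[Vector], k: Int, epsilon: Double): KMeansModel =
  new KMeans()
    .setK(k)                                        // number of clusters
    .setInitializationMode(KMeans.K_MEANS_PARALLEL) // the "k-means||" initialization
    .setMaxIterations(1000)                         // assumed generous cap; still an upper bound
    .setEpsilon(epsilon)                            // convergence threshold on center movement
    .run(data)

// trainRDD is the RDD[Vector] from the question.
val model = trainWithEpsilon(trainRDD, k = 8, epsilon = 1e-4)
```

The maximum number of iterations still acts as an upper bound on training, so it is raised here to make it more likely that the epsilon threshold is what stops the run.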

Try the setEpsilon method.

@croxy How do I do that?

@kiseliu How to do what?