Scala: Creating LabeledPoints from a sparse vector in Spark

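I created a feature vector in a DataFrame in Spark/Scala using VectorAssembler. Everything works fine so far. Now I want to create LabeledPoints from the label and the sparse vector: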

val labeledPoints = featureDf.map{r=>
  val label = r(0).toString.toDouble + r(1).toString.toDouble + r(2).toString.toDouble
  val features = r(r.size-1)
  LabeledPoint(label, Vectors.sparse(features))

}
But this does not work. I get a compile error:

overloaded method value sparse with alternatives:
(size: Int,elements: Iterable[(Integer,java.lang.Double)])org.apache.spark.mllib.linalg.Vector
<and>
(size: Int,elements: Seq[(Int, scala.Double)])org.apache.spark.mllib.linalg.Vector
<and>
(size: Int,indices: Array[Int],values:Array[scala.Double])org.apache.spark.mllib.linalg.Vector
cannot be applied to (Any)
I have already tried casting the vector with
val features = r(r.size-1).asInstanceOf[Vector]
and similar approaches, but nothing worked. Does anyone know how to solve this?


Thanks in advance.

What you need here is the Row.getAs method, which returns the column value as the requested type instead of Any:

val features = r.getAs[org.apache.spark.mllib.linalg.SparseVector](r.size - 1)
It also supports extraction by name, so assuming your column is named features:

r.getAs[org.apache.spark.mllib.linalg.SparseVector]("features")
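
Putting it together, here is a minimal sketch of the corrected mapping. It assumes a Spark 1.x setup in which VectorAssembler emits org.apache.spark.mllib.linalg vectors, and that the first three columns hold the label parts, as in the question. The helper names rowToLabeledPoint and toLabeledPoints are made up for illustration. Note that the extracted vector is passed to LabeledPoint directly: Vectors.sparse builds a new vector from a size plus indices and values, so it cannot accept an existing vector. The sketch asks for the general Vector type rather than SparseVector, since VectorAssembler may also produce dense vectors.

```scala
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row}

// Hypothetical helper: converts one row into a LabeledPoint.
// Assumes columns 0..2 hold the label parts and the last column holds
// the assembled feature vector (an mllib Vector, i.e. Spark 1.x).
def rowToLabeledPoint(r: Row): LabeledPoint = {
  val label = r(0).toString.toDouble + r(1).toString.toDouble + r(2).toString.toDouble
  // getAs casts the value to Vector instead of returning Any
  val features = r.getAs[Vector](r.size - 1)
  // Pass the vector as-is; no call to Vectors.sparse is needed
  LabeledPoint(label, features)
}

// Hypothetical helper: going through .rdd avoids needing an Encoder
// for LabeledPoint on newer Spark versions.
def toLabeledPoints(featureDf: DataFrame): RDD[LabeledPoint] =
  featureDf.rdd.map(rowToLabeledPoint)
```

On Spark 2.x and later, VectorAssembler produces org.apache.spark.ml.linalg vectors instead, so the getAs type parameter would have to change accordingly, and the result would need converting (for example with org.apache.spark.mllib.linalg.Vectors.fromML) before it can go into an mllib LabeledPoint.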