Scala cast error: casting ArrayType(DoubleType,true) to DoubleType

I have a parquet file that contains id and features columns. When I try to cast the features column from

ArrayType(DoubleType,true) to DoubleType

I get this error:

cannot resolve 'CAST(`features` AS DOUBLE)' due to data type mismatch: cannot cast ArrayType(DoubleType,true) to DoubleType; line 1 pos 0;
How can I solve this?
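
The cast that triggers this is presumably something like the following (a reconstruction from the error message; the question doesn't show the original query):

import org.apache.spark.sql.functions.col

val training = spark.read.parquet("/usr/local/spark/dataset/data/user")
// features is ArrayType(DoubleType,true); there is no array-to-scalar cast,
// so the analyzer rejects CAST(features AS DOUBLE)
training.select(col("features").cast("double"))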

After the edit, I get this error:

java.lang.ClassCastException: scala.collection.mutable.WrappedArray$ofRef cannot be cast to [D


The features column contains an array of DoubleType, so it cannot be cast to DoubleType. Use Vectors.dense to convert this column to a vector, then use VectorAssembler on the columns holding the doubles and vectors.

Roughly like this:

val training = spark.read.parquet("/usr/local/spark/dataset/data/user")
val df = training.map { r =>
  (Vectors.dense(r.getAs[Array[Double]]("features")), r.getAs[Double]("id"))
}.toDF("features", "id")
val assembler = new VectorAssembler().setInputCols(Array("features")).setOutputCol("feature")
val data = assembler.transform(df)
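
As a side note: if your Spark version is 3.1 or later (the question doesn't state one, so this is an assumption), org.apache.spark.ml.functions.array_to_vector does the same conversion without the typed map:

import org.apache.spark.ml.functions.array_to_vector
import org.apache.spark.sql.functions.col

val training = spark.read.parquet("/usr/local/spark/dataset/data/user")
// Converts the ArrayType(DoubleType) column into a Vector column in one step
val data = training.withColumn("feature", array_to_vector(col("features")))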

Comments:
"The features column contains an array of DoubleType, so it cannot be cast to DoubleType." Yes, but I get the error scala.collection.mutable.WrappedArray$ofRef cannot be cast to [D.
Can you update the question with data that reproduces the error? Show the schema.
Basically, you just need to match the data type to extract the doubles, convert the doubles to a vector, and then feed it to VectorAssembler:
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.ml.linalg.Vectors

val training = spark.read.parquet("/usr/local/spark/dataset/data/user")
import spark.implicits._
// getAs[Seq[Double]] avoids the WrappedArray ClassCastException that getAs[Array[Double]] throws
val df = training.map { r =>
  (Vectors.dense(r.getAs[Seq[Double]]("features").toArray), r.getAs[Double]("id"))
}.toDF("features", "id")
val assembler = new VectorAssembler().setInputCols(Array("features")).setOutputCol("feature")
val data = assembler.transform(df)