Scala: Fill a Spark DataFrame with a Vector for null values

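I have a DataFrame that contains feature vectors created by VectorAssembler, and it also contains null values. Now I want to replace the null values with a vector: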

 val nil = Vectors.dense(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,1.0, 1.0, 1.0, 1.0, 1.0,1.0, 1.0, 1.0, 1.0, 1.0)

df.na.fill(nil) // does not work.
What is the correct way to do this?

Edit: Thanks to this answer, I found a way:

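import org.apache.spark.ml.linalg.Vectors // VectorAssembler in spark.ml produces ml.linalg vectors
import org.apache.spark.sql.functions.{broadcast, coalesce}
import sc.implicits._ // `sc` must be a SparkSession or SQLContext here, not a SparkContext

// Replacement vector: 20 ones
val nil = Vectors.dense(Array.fill(20)(1.0))

// One-row DataFrame holding the replacement vector
val fill = Seq(Tuple1(nil)).toDF("replacement")

// Columns to patch: every field whose name contains "1"
val dates = data.schema.fieldNames.filter(e => e.contains("1"))

// `data` must be declared as a `var` for the reassignments below
data = data.crossJoin(broadcast(fill))
for (e <- dates) {
  data = data.withColumn(e, coalesce(data.col(e), $"replacement"))
}
data = data.drop("replacement")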

If the problem is caused by some additional rows being added, you can join with the replacement:

import org.apache.spark.sql.functions._

val df = Seq((1, None), (2, Some(nil))).toDF("id", "vector")
val fill = Seq(Tuple1(nil)).toDF("replacement")

df.crossJoin(broadcast(fill)).withColumn("vector", coalesce($"vector", $"replacement")).drop("replacement")
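
As an alternative to the cross join, a UDF can substitute the default vector directly. The following is only a minimal sketch, assuming Spark 2.x or later with `org.apache.spark.ml.linalg` vectors and a local SparkSession; the name `fillVector` and the sample rows are hypothetical and not part of the original question. It relies on Spark passing `null` into a Scala UDF whose argument is a reference type such as `Vector`.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Assumption: a local session just for this illustration
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("fill-null-vector")
  .getOrCreate()
import spark.implicits._

// Replacement vector: 20 ones, as in the question
val nil: Vector = Vectors.dense(Array.fill(20)(1.0))

// Hypothetical sample data: row 1 has a null vector, row 2 has a real one
val df = Seq[(Int, Option[Vector])](
  (1, None),
  (2, Some(Vectors.dense(Array.fill(20)(2.0))))
).toDF("id", "vector")

// Option(v) turns a null input into None, so getOrElse supplies the default
val fillVector = udf((v: Vector) => Option(v).getOrElse(nil))

val filled = df.withColumn("vector", fillVector(col("vector")))
filled.show(false)
```

The UDF avoids the temporary "replacement" column, at the cost of being opaque to the query optimizer; the broadcast cross join with coalesce keeps everything in built-in expressions.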