
Spark Scala - comparing a column value with a parameter


I want to compare a dataframe column with a value. I tried converting the value and using `lit()`, but with no result. Below I attach my hardcoded version, which does not meet the requirement:

object Analyzer {
  def main(args: Array[String]): Unit = {

    // my val used to compare with the column
    val minEfficiency: Double = 0.9

    // I would like to compare the column with the declared val
    // here is the hardcoded (poor) version
    val metrics = dataframe.withColumn("State",
      when($"Efficiency" >= 0.9, "ok").otherwise("not ok")
    )

  }
}
Dataframe info:

scala> dataframe.printSchema()
root
 |-- SensorId: integer (nullable = true)
 |-- Efficiency: double (nullable = true)
scala> dataframe.show()
+--------+-----------+
|SensorId| Efficiency|
+--------+-----------+
|       1|      0.356|
|       2|       0.99|
|       3|        1.0|
|       4|      0.256|
|       5|        0.9|
+--------+-----------+

You can also use `transform` to do the following:

import org.apache.spark.sql.functions._
import org.apache.spark.sql._
// in a compiled app, also import spark.implicits._ so that .toDF works on a Seq
// (it is already in scope in spark-shell)

val df = Seq(10,0.9,-1,0.3).toDF("Efficiency")
val minEfficiency = 0.9

def withMinEfficiency(minValue: Double)(df: DataFrame): DataFrame = {
  df.withColumn("State", when('Efficiency >= minValue,"Ok").otherwise("Not Ok"))
}

df.transform(withMinEfficiency(minEfficiency)).show(false)
Output:

+----------+------+
|Efficiency|State |
+----------+------+
|10.0      |Ok    |
|0.9       |Ok    |
|-1.0      |Not Ok|
|0.3       |Not Ok|
+----------+------+     
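A side note on why `transform` is worth the extra function: such builder functions compose, so several of them can be chained into a readable pipeline. A minimal sketch (the second function, `withPerfectFlag`, is purely illustrative and not from the original question):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

// Illustrative second transformation: flag sensors running at full efficiency.
def withPerfectFlag(df: DataFrame): DataFrame =
  df.withColumn("Perfect", col("Efficiency") === 1.0)

// transform calls compose left-to-right into a pipeline:
// df.transform(withMinEfficiency(0.9)).transform(withPerfectFlag).show(false)
```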

Have you tried using:

dataframe.withColumn("State", when(col("Efficiency") >= lit(minEfficiency), "ok").otherwise("not ok"))

Can you add some data and clarify? I can't tell what you are asking. Are you saying this doesn't work? `when($"Efficiency" >= minEfficiency, "ok").otherwise("not ok")`

How about this: `dataframe.withColumn("State", when(col("Efficiency") >= lit(minEfficiency), "ok").otherwise("not ok"))`

What error message do you get? @astro_asz, your solution using col() works perfectly! Please post your comment as an answer.
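For reference, here is the fix from the comments assembled into a minimal, self-contained sketch (the local SparkSession setup is an assumption for running it standalone; `lit()` is optional here, since `Column.>=` also accepts a plain Scala literal):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, when}

object Analyzer {
  def main(args: Array[String]): Unit = {
    // local session only for this self-contained example
    val spark = SparkSession.builder()
      .appName("Analyzer")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val minEfficiency: Double = 0.9

    // sample data matching the schema from the question
    val dataframe = Seq((1, 0.356), (2, 0.99), (3, 1.0), (4, 0.256), (5, 0.9))
      .toDF("SensorId", "Efficiency")

    // compare the column against the declared val instead of a hardcoded 0.9
    val metrics = dataframe.withColumn("State",
      when(col("Efficiency") >= lit(minEfficiency), "ok").otherwise("not ok"))

    metrics.show()
    spark.stop()
  }
}
```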