Spark Scala: comparing column values with a parameter
I want to compare a DataFrame column with a value. I tried converting the value and using lit(), but without success. Below is my hardcoded version, which does not meet the requirement:
object Analyzer {
  def main(args: Array[String]): Unit = {
    // the value I want to compare the column against
    val minEfficiency: Double = 0.9
    // I would like to compare the column with the declared val;
    // here is the hardcoded (poor) version
    val metrics = dataframe.withColumn("State",
      when($"Efficiency" >= 0.9, "ok").otherwise("not ok")
    )
  }
}
DataFrame information:
scala> dataframe.printSchema()
root
|-- SensorId: integer (nullable = true)
|-- Efficiency: double (nullable = true)
scala> dataframe.show()
+--------+-----------+
|SensorId| Efficiency|
+--------+-----------+
|       1|      0.356|
|       2|       0.99|
|       3|        1.0|
|       4|      0.256|
|       5|        0.9|
+--------+-----------+
You can also do this using transform, as follows:
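import org.apache.spark.sql.functions._
import org.apache.spark.sql._
// Outside spark-shell, toDF and the 'Efficiency syntax also need: import spark.implicits._
val df = Seq(10,0.9,-1,0.3).toDF("Efficiency")
val minEfficiency = 0.9
// Curried so the threshold can be fixed first and the result passed to transform
def withMinEfficiency(minValue: Double)(df: DataFrame): DataFrame = {
  df.withColumn("State", when('Efficiency >= minValue, "Ok").otherwise("Not Ok"))
}
df.transform(withMinEfficiency(minEfficiency)).show(false)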
Output:
+----------+------+
|Efficiency|State |
+----------+------+
|10.0 |Ok |
|0.9 |Ok |
|-1.0 |Not Ok|
|0.3 |Not Ok|
+----------+------+
Have you tried using col()?
dataframe.withColumn("State", when(col("Efficiency") >= lit(minEfficiency), "ok").otherwise("not ok"))
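To show how the col()/lit() suggestion fits into the Analyzer object from the question, here is a minimal sketch. It rests on several assumptions not in the original post: a local SparkSession, a hypothetical helper named withState, the threshold read from the first command-line argument with 0.9 as the default, and the sample rows rebuilt from the dataframe.show() output above.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, lit, when}

object Analyzer {
  // Hypothetical helper (not from the original post): labels each row by
  // comparing the Efficiency column with the threshold passed as a parameter.
  def withState(df: DataFrame, minEfficiency: Double): DataFrame =
    df.withColumn(
      "State",
      when(col("Efficiency") >= lit(minEfficiency), "ok").otherwise("not ok")
    )

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("Analyzer")
      .getOrCreate()
    import spark.implicits._ // enables toDF on local collections

    // Assumption: threshold taken from the first CLI argument, defaulting to 0.9.
    val minEfficiency: Double = args.headOption.map(_.toDouble).getOrElse(0.9)

    // Sample rows rebuilt from the dataframe.show() output in the question.
    val dataframe = Seq((1, 0.356), (2, 0.99), (3, 1.0), (4, 0.256), (5, 0.9))
      .toDF("SensorId", "Efficiency")

    withState(dataframe, minEfficiency).show()
    spark.stop()
  }
}
```

Column's >= operator accepts any value and treats a non-Column argument as a literal, so col("Efficiency") >= minEfficiency should behave the same way; writing lit() just makes the conversion explicit.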
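Can you add some data and clarify? I can't tell what you are asking. Are you saying this doesn't work: when($"Efficiency" >= minEfficiency, "ok").otherwise("not ok")
How about this: dataframe.withColumn("State", when(col("Efficiency") >= lit(minEfficiency), "ok").otherwise("not ok"))
What kind of error message do you get?
@astro_asz, your solution using col() works perfectly! Please add your comment to the answers section.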