Scala: creating custom rules with the when function

Tags: scala, apache-spark, apache-spark-sql

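I am trying to create custom rules with the "when" function so that I can eventually apply them to a column of a DataFrame. Many of these rules will be applied to several different columns, so rather than writing them out for every column, the idea is to store them in variables and combine them. For example, I have the following: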

df
.withColumn("campoOut1",when(col("campo1") === "G" && col("campo2") === "00", "001"))
.withColumn("campoOut2",
    when(col("campo1") === "G" && col("campo2") === "00", "001").
    when(col("campo3") === "G" && col("campo4") =!= "00", "002"))
I would like to achieve something like this:

val ruler1 = when(col("campo1") === "G" && col("campo2") === "00", "001")
val ruler2 = when(col("campo3") === "G" && col("campo4") =!= "00", "002")

 df.withColumn("campoOut1",ruler1)
   .withColumn("campoOut2",ruler1 + ruler2)
This does not work, because the variables ruler1 and ruler2 are not of type String. Do you know how this can be done?

Thanks very much in advance.

You can chain the rules recursively:

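import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.when

// Builds when(c1, v1).when(c2, v2)... from (condition, value) pairs; needs at least one rule.
def chainRules(rules: (Column, String)*): Column = {
  // Appends the remaining rules to the chain built so far.
  def go(rules: Seq[(Column, String)], chained: Column): Column = {
    if (rules.isEmpty) {
      return chained
    }
    go(rules.tail, chained.when(rules.head._1, rules.head._2))
  }
  go(rules.tail, when(rules.head._1, rules.head._2))
}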
But you need to adapt your rules so that each one is a (condition, value) pair:

val rule1 = (col("campo1") === "G" && col("campo2") === "00", "001")
val rule2 = (col("campo3") === "G" && col("campo4") =!= "00", "002")

You can then use them like this:

df.withColumn("campoOut1", chainRules(rule1))
  .withColumn("campoOut2", chainRules(rule1, rule2))
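
To see the chained rules applied end to end, here is a minimal sketch in spark-shell style. The sample rows, the local SparkSession settings, and the "000" default are assumptions made for illustration; they are not part of the question. chainRules is copied from the answer. A when chain without otherwise yields null for rows that match no rule, so the sketch appends otherwise to supply a default.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.{col, when}

// Local session for the demo only (assumption); in spark-shell getOrCreate() reuses the existing one.
val spark = SparkSession.builder().master("local[1]").appName("chain-rules-demo").getOrCreate()
import spark.implicits._

// chainRules as posted in the answer.
def chainRules(rules: (Column, String)*): Column = {
  def go(rules: Seq[(Column, String)], chained: Column): Column = {
    if (rules.isEmpty) {
      return chained
    }
    go(rules.tail, chained.when(rules.head._1, rules.head._2))
  }
  go(rules.tail, when(rules.head._1, rules.head._2))
}

// Hypothetical sample data: one row per case.
val df = Seq(
  ("G", "00", "X", "00"), // satisfies rule1
  ("A", "11", "G", "01"), // satisfies rule2 only
  ("A", "11", "X", "00")  // satisfies neither rule
).toDF("campo1", "campo2", "campo3", "campo4")

val rule1 = (col("campo1") === "G" && col("campo2") === "00", "001")
val rule2 = (col("campo3") === "G" && col("campo4") =!= "00", "002")

// "000" is an assumed default; drop .otherwise(...) to get null for unmatched rows instead.
val out = df.withColumn("campoOut2", chainRules(rule1, rule2).otherwise("000"))
val codes = out.select("campoOut2").as[String].collect().toSeq

out.show()
```

Rules are evaluated in the order they are passed, so when a row satisfies several rules, the first matching one decides the value.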

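Nice. One question: what does the asterisk after the parentheses mean? rules: (Column, String)*

It is Scala's version of variable arguments (varargs).
def x(vars: String*) = println(vars)
lets you call x like this:
x("multiple", "arguments", "in", "here")
Inside the function itself, vars will be a WrappedArray.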
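
As a small illustration of the varargs explanation, the sketch below uses a hypothetical helper named describe (not from the thread) to show both call styles: passing arguments one by one, and passing an existing collection with the `: _*` ascription. Inside the function the parameter behaves as a Seq[String]; the concrete class depends on the Scala version (WrappedArray in 2.12, ArraySeq in 2.13).

```scala
// Hypothetical helper: reports how many arguments it received and what they were.
def describe(words: String*): String =
  s"${words.length} args: ${words.mkString(", ")}"

// Arguments passed one by one.
val direct = describe("multiple", "arguments", "in", "here")

// An existing collection is expanded into varargs with `: _*`.
val parts = Seq("a", "b")
val splatted = describe(parts: _*)

println(direct)
println(splatted)
```

The same ascription should let a collection of rules be passed to the answer's function, for example chainRules(allRules: _*) where allRules is a Seq[(Column, String)].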
I see. Is there another way to write the code without using "return"? IntelliJ IDEA flags it as bad practice.

Wrap the second call (the recursive go) in an else; then you can remove the return.
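
The suggestion above can be sketched as follows. The first variant is the answer's recursion with if/else in place of return. The second variant, chainRulesFold, is an alternative that is not from the thread: it folds over the remaining rules and checks up front that at least one rule was passed. The names rest and chainRulesFold are chosen for this example.

```scala
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.when

// Variant 1: same recursion as the answer, with if/else instead of `return`.
def chainRules(rules: (Column, String)*): Column = {
  @scala.annotation.tailrec
  def go(rest: Seq[(Column, String)], chained: Column): Column =
    if (rest.isEmpty) chained
    else go(rest.tail, chained.when(rest.head._1, rest.head._2))

  go(rules.tail, when(rules.head._1, rules.head._2))
}

// Variant 2 (alternative, not from the thread): start from the first rule and fold in the rest.
def chainRulesFold(rules: (Column, String)*): Column = {
  require(rules.nonEmpty, "at least one rule is required")
  val (firstCond, firstValue) = rules.head
  rules.tail.foldLeft(when(firstCond, firstValue)) {
    case (chained, (cond, value)) => chained.when(cond, value)
  }
}
```

Both variants are called exactly like the original, for example chainRules(rule1, rule2), and build the same when chain in the same order.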