Scala: Unable to convert a function to a UDF

Tags: scala, apache-spark, user-defined-functions

Disclaimer: I am brand new to Scala.

Here is the function:

import com.amazon.deequ.suggestions.{ConstraintSuggestionRunner, Rules}
import org.apache.spark.sql.DataFrame
// Needed for .toDF() below; assumes a SparkSession named `spark` is in scope
import spark.implicits._

def getSuggestedTests(df: DataFrame): DataFrame = {
  // Ask Deequ to compute constraint suggestions for us on the data
  val suggestionResult = ConstraintSuggestionRunner()
    // data to suggest constraints for
    .onData(df)
    // default set of rules for constraint suggestion
    .addConstraintRules(Rules.DEFAULT)
    // run data profiling and constraint suggestion
    .run()

  // Flatten the suggestions into (column, description, code) rows.
  // The last expression is the return value, so no `return` is needed.
  suggestionResult.constraintSuggestions.flatMap {
    case (column, suggestions) =>
      suggestions.map { constraint =>
        (column, constraint.description, constraint.codeForConstraint)
      }
  }.toSeq.toDF()
}

As I currently understand it, a UDF should be created like this:

val getSuggestedTestsUdf = udf(getSuggestedTests(_: DataFrame))
I get the following error:

An error was encountered:
java.lang.UnsupportedOperationException: Schema for type org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] is not supported

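You don't need a UDF here. Just call getSuggestedTests as a normal function that takes a DataFrame. UDFs operate on the values in each row, not on whole DataFrames, which is why Spark reports that it has no schema for the type Dataset[Row].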
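
To make the distinction concrete, here is a minimal sketch contrasting the two kinds of function. The names `UdfVsDataFrameFunction`, `nameLength`, `withNameLength`, and the `name` column are hypothetical and used only for illustration; the sketch assumes Spark is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object UdfVsDataFrameFunction {
  // Row-level UDF: its input and output are column value types
  // (String, Int, ...) for which Spark can derive a schema.
  val nameLength = udf((s: String) => if (s == null) 0 else s.length)

  // DataFrame-level function: an ordinary Scala method that takes and
  // returns a DataFrame. It is called directly, never wrapped in udf(...).
  def withNameLength(df: DataFrame): DataFrame =
    df.withColumn("name_length", nameLength(col("name")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-vs-dataframe-function")
      .getOrCreate()
    import spark.implicits._

    val df = Seq("alice", "bob").toDF("name")

    // Plain method call on the whole DataFrame
    withNameLength(df).show()

    spark.stop()
  }
}
```

In the question's case, the same pattern would presumably be `val suggestions = getSuggestedTests(df)`, with no `udf(...)` wrapper at all.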