Scala: unable to convert a function to a UDF
Disclaimer: I am completely new to Scala. Here is the function:
import com.amazon.deequ.suggestions.{ConstraintSuggestionRunner, Rules}
import org.apache.spark.sql.DataFrame

def getSuggestedTests(df: DataFrame): DataFrame = {
  // Needed for .toDF() on a local Seq (assumes the implicits are not already in scope)
  import df.sparkSession.implicits._

  // We ask deequ to compute constraint suggestions for us on the data
  val suggestionResult = ConstraintSuggestionRunner()
    // data to suggest constraints for
    .onData(df)
    // default set of rules for constraint suggestion
    .addConstraintRules(Rules.DEFAULT)
    // run data profiling and constraint suggestion
    .run()

  // We can now investigate the constraints that Deequ suggested.
  val suggestionDataFrame = suggestionResult.constraintSuggestions.flatMap {
    case (column, suggestions) =>
      suggestions.map { constraint =>
        (column, constraint.description, constraint.codeForConstraint)
      }
  }.toSeq.toDF()

  suggestionDataFrame
}
As far as I understand at this point, a UDF should be created like this:
val getSuggestedTestsUdf = udf(getSuggestedTests(_: DataFrame))
I get the following error:
An error was encountered:
java.lang.UnsupportedOperationException: Schema for type org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] is not supported
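You don't need a UDF here. Just call getSuggestedTests as a normal function that takes a DataFrame. A UDF operates on the column values of individual rows, not on whole DataFrames. That is why the error appears: Spark tries to derive a column schema for the UDF's argument type and cannot do so for Dataset[Row].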
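To make the distinction concrete, here is a minimal sketch that contrasts the two cases. It assumes a local SparkSession and uses hypothetical names that are not part of the question (UdfVsFunction, describeColumns, toUpper). describeColumns stands in for getSuggestedTests so that the example runs without deequ.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object UdfVsFunction {
  // Hypothetical stand-in for getSuggestedTests: an ordinary Scala function
  // from DataFrame to DataFrame. It is called directly, never wrapped in udf().
  def describeColumns(df: DataFrame): DataFrame = {
    import df.sparkSession.implicits._
    df.schema.fields
      .map(f => (f.name, f.dataType.simpleString))
      .toSeq
      .toDF("column", "type")
  }

  // A real UDF wraps a function over column values (here a single String),
  // and Spark applies it to each row of the column it is given.
  val toUpper = udf((s: String) => if (s == null) null else s.toUpperCase)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-vs-function")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("alice", 1), ("bob", 2)).toDF("name", "id")

    // DataFrame-level logic: plain function call.
    describeColumns(df).show()

    // Row-level logic: apply the UDF to a column.
    df.withColumn("name_upper", toUpper(col("name"))).show()

    spark.stop()
  }
}
```

In the question's case, the equivalent of the first call would simply be val suggestions = getSuggestedTests(df), with no udf(...) involved.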