Apache Spark: What should the input type be for a UDF that accepts columns of type StructType or array?


My DataFrame's schema looks like this:

root
 |-- col1: string (nullable = true)
 |-- col2: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- unit1: string (nullable = true)
 |    |    |-- sum(unit2): string (nullable = true)
 |    |    |-- max(unit3): string (nullable = true)
 |-- col3: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- unit1: string (nullable = true)
 |    |    |-- sum(unit2): string (nullable = true)
 |    |    |-- max(unit3): string (nullable = true)

I am writing a UDF in Scala that takes the columns col2 and col3.
Note that the value of col2 can be null.

val process_stuff = udf((col2: ?, col3: ?) => {

So far I have tried this, among other things:

val process_stuff = udf((col2: ArrayType[StructType[StructField]], col3: ArrayType[StructType[StructField]]) => {

But it gives me warnings here and there.

Please help!

Your UDF should have the following signature:

val process_stuff = udf((col2: Seq[Row], col3: Seq[Row]) => {...})

That doesn't work. Inside the UDF, I get a type mismatch error when I access unit1: "found: Any, required: String".

Try replacing Seq[Row] with Seq[String]; it should work.
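For what it's worth, the "found: Any, required: String" error is what you would expect if the field is read with `row(0)` or `row.get(0)`, since `Row.apply` and `Row.get` return `Any`. A minimal sketch of one way around it is below, keeping the `Seq[Row]` signature and reading the field with `getAs[String]`. The `Entry` case class, the `ProcessStuffSketch` object, the `unit1Values` helper, the sample data, and the renamed fields `sumUnit2` / `maxUnit3` are all assumptions made for illustration; what the UDF should actually compute is not stated in the question, so collecting the `unit1` values is only a placeholder.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical element type. The real fields sum(unit2) and max(unit3) are
// renamed here so that they can be ordinary case-class members.
case class Entry(unit1: String, sumUnit2: String, maxUnit3: String)

object ProcessStuffSketch {
  // Placeholder logic: collect unit1 from both arrays.
  // Option(...) maps a null array to an empty one, so a null col2 is tolerated.
  // getAs[String] returns the field as a String instead of Any.
  def unit1Values(col2: Seq[Row], col3: Seq[Row]): Seq[String] = {
    val left  = Option(col2).getOrElse(Seq.empty[Row])
    val right = Option(col3).getOrElse(Seq.empty[Row])
    (left ++ right).map(_.getAs[String]("unit1"))
  }

  // Array-of-struct columns arrive in a Scala UDF as Seq[Row].
  val process_stuff = udf((col2: Seq[Row], col3: Seq[Row]) => unit1Values(col2, col3))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("udf-array-of-struct")
      .getOrCreate()
    import spark.implicits._

    // Sample rows (made up); the second row has a null col2.
    val df = Seq[(String, Seq[Entry], Seq[Entry])](
      ("a", Seq(Entry("x", "1", "2")), Seq(Entry("y", "3", "4"))),
      ("b", null, Seq(Entry("z", "5", "6")))
    ).toDF("col1", "col2", "col3")

    df.withColumn("units", process_stuff(col("col2"), col("col3"))).show(false)
    spark.stop()
  }
}
```

If you prefer positional access, `row.getString(0)` should work as well, assuming `unit1` is the first field of the struct as in the schema above.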