Scala: How to aggregate values into a list of maps after a groupBy?

I have a table like this:

id  | fruit  | buy_time
------------------------
1   | apple  | 100
1   | banana | 105
2   | grapes | 102
2   | orange | 101
2   | apple  | 110
My expected output is a list of maps grouped by id.


Use .groupBy together with the to_json (Spark 2.4+), collect_list, and struct functions.

Example:

import org.apache.spark.sql.functions._
// toDF needs spark.implicits._, which spark-shell imports automatically
import spark.implicits._

// Sample data matching the table above
val df = Seq((1, "apple", 100), (1, "banana", 105), (2, "grapes", 102), (2, "orange", 101), (2, "apple", 110))
  .toDF("id", "fruit", "buy_time")

// Pack each row into a struct, collect the structs per id, then serialize the list to a JSON string
df.groupBy("id")
  .agg(to_json(collect_list(struct(col("fruit"), col("buy_time").alias("time")))).alias("buy_info"))
  .show(10, false)
//+---+------------------------------------------------------------------------------------------+
//|id |buy_info                                                                                  |
//+---+------------------------------------------------------------------------------------------+
//|1  |[{"fruit":"apple","time":100},{"fruit":"banana","time":105}]                              |
//|2  |[{"fruit":"grapes","time":102},{"fruit":"orange","time":101},{"fruit":"apple","time":110}]|
//+---+------------------------------------------------------------------------------------------+
It turns out that buy_info is stored as a string, not as an "array of maps". Do you have any more tips on how to turn it into the "array of maps" format?
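One possible way to get an actual array of maps, rather than a JSON string, is to drop to_json and build each element with the map function instead of struct. The following is only a sketch, not part of the original answer. It assumes a local SparkSession, and it casts buy_time to a string because all values in a Spark map must share a single type.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, collect_list, lit, map}

// Local session for illustration only (an assumption); in spark-shell `spark` already exists.
val spark = SparkSession.builder().master("local[1]").appName("buy-info-sketch").getOrCreate()
import spark.implicits._

val df = Seq((1, "apple", 100), (1, "banana", 105), (2, "grapes", 102), (2, "orange", 101), (2, "apple", 110))
  .toDF("id", "fruit", "buy_time")

// Build one map per row, then collect the maps per id.
// buy_time is cast to string so that both map values have the same type.
val buyInfo = df.groupBy("id").agg(
  collect_list(
    map(lit("fruit"), col("fruit"), lit("time"), col("buy_time").cast("string"))
  ).alias("buy_info")
)

// The column type is now array<map<string,string>> instead of string.
buyInfo.printSchema()
buyInfo.show(10, false)
```

If the integer type of time matters, keeping struct and simply removing to_json should give an array<struct<fruit:string,time:int>> column instead. In both variants the order of elements inside each list is not guaranteed, because collect_list depends on the order in which rows arrive after the shuffle.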