Scala: how to aggregate values into a list of maps after a group by?
I have a table like this:
id | fruit | buy_time
------------------------
1 | apple | 100
1 | banana | 105
2 | grapes | 102
2 | orange | 101
2 | apple | 110
My expected output is a list of maps, grouped by id.
Use .groupBy together with the to_json (Spark 2.4+), collect_list, and struct functions.
Example:
import org.apache.spark.sql.functions._
// In spark-shell, spark.implicits._ (needed for toDF) is already imported.
val df = Seq((1,"apple",100),(1,"banana",105),(2,"grapes",102),(2,"orange",101),(2,"apple",110)).toDF("id","fruit","buy_time")
// Collect one struct per row into a list per id, then serialize that list to a JSON string.
df.groupBy("id")
  .agg(to_json(collect_list(struct(col("fruit"), col("buy_time").alias("time")))).alias("buy_info"))
  .show(10, false)
//+---+------------------------------------------------------------------------------------------+
//|id |buy_info                                                                                  |
//+---+------------------------------------------------------------------------------------------+
//|1  |[{"fruit":"apple","time":100},{"fruit":"banana","time":105}]                              |
//|2  |[{"fruit":"grapes","time":102},{"fruit":"orange","time":101},{"fruit":"apple","time":110}]|
//+---+------------------------------------------------------------------------------------------+
It turns out that buy_info is stored as a string rather than as an array of maps. Do you have any more tips on how to turn it into an array of maps?
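One possible way to get an actual array of maps, rather than a JSON string, is to build each element with the map function and skip to_json. This is only a sketch under some assumptions: the local SparkSession, the app name, and the name grouped are made up for illustration, and buy_time is cast to string because a map column has a single value type. The order of elements inside each list is not guaranteed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, collect_list, lit, map}

// Local session for this sketch only; in spark-shell, `spark` already exists.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("buy-info-sketch") // hypothetical name
  .getOrCreate()
import spark.implicits._

val df = Seq(
  (1, "apple", 100), (1, "banana", 105),
  (2, "grapes", 102), (2, "orange", 101), (2, "apple", 110)
).toDF("id", "fruit", "buy_time")

// Build one map per row, then collect the maps into a list per id.
// A map has a single value type, so buy_time is cast to string (an assumption).
val grouped = df.groupBy("id").agg(
  collect_list(
    map(lit("fruit"), col("fruit"), lit("time"), col("buy_time").cast("string"))
  ).alias("buy_info")
)

// buy_info should now be an array of maps instead of a string.
grouped.printSchema()
grouped.show(10, false)
```

If an array of structs is acceptable instead of an array of maps, removing to_json from the original expression and keeping collect_list(struct(...)) would also leave buy_info as an array, with buy_time kept as an integer.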