How to expand a list column into multiple rows in Scala
I want to transform the following DataFrame:
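import org.apache.spark.sql.types._
// createDF is not part of core Spark; it is assumed here to come from the spark-daria library.
import com.github.mrpowers.spark.daria.sql.SparkSessionExt._

val articledDF = spark.createDF(
  List(
    ("article 1", Array("topic 1", "topic 2")),
    ("article 2", Array("topic 1", "topic 3")),
    ("article 3", Array("topic 2"))
  ), List(
    ("article", StringType, true),
    ("topics", ArrayType(StringType, true), true)
  )
)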
This produces:
+---------+------------------+
|article  |topics            |
+---------+------------------+
|article 1|[topic 1, topic 2]|
|article 2|[topic 1, topic 3]|
|article 3|[topic 2]         |
+---------+------------------+
I would like to expand the topics column so that each topic gets its own row:
+---------+-------+
|article  |topic  |
+---------+-------+
|article 1|topic 1|
|article 1|topic 2|
|article 2|topic 1|
|article 2|topic 3|
|article 3|topic 2|
+---------+-------+
I would be glad to learn how to do this.

Use explode:
import org.apache.spark.sql.functions._
import spark.implicits._
articledDF.select($"article", explode($"topics") as "topic")
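
For readers who want to try this without the createDF helper, here is a minimal sketch that builds the same data with plain Spark (toDF) and applies explode. The local SparkSession setup, the variable names, and the extra "article 4" row with an empty topic list are assumptions added for illustration; they are not part of the original question. The sketch also contrasts explode with explode_outer, since explode drops rows whose array is empty or null.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, explode_outer}

// Local session for experimentation (assumption); on Databricks a `spark` session already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("explode-sketch")
  .getOrCreate()
import spark.implicits._

// Same shape as the question's data, plus a hypothetical "article 4" with no topics.
val articles = Seq(
  ("article 1", Seq("topic 1", "topic 2")),
  ("article 2", Seq("topic 1", "topic 3")),
  ("article 3", Seq("topic 2")),
  ("article 4", Seq.empty[String])
).toDF("article", "topics")

// explode emits one row per array element and drops rows whose array is empty or null.
val exploded = articles.select(col("article"), explode(col("topics")).as("topic"))

// explode_outer (Spark 2.2+) keeps such rows and fills the new column with null.
val explodedOuter = articles.select(col("article"), explode_outer(col("topics")).as("topic"))

exploded.show()
explodedOuter.show()
```

With this data, exploded has five rows (article 4 disappears) and explodedOuter has six (article 4 is kept with a null topic). If every article must survive the transformation, explode_outer is probably the safer choice.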