Row operations on a DataFrame in Scala Spark
Tags: scala, apache-spark, dataframe, apache-spark-sql


I have a DataFrame in Spark that looks like this:

 column_A | column_B
 -------- | --------
 1        | 1,12,21
 2        | 6,9
Both column_A and column_B are of string type.

How can I convert the DataFrame above into a new DataFrame like the following?

 column_new_A | column_new_B
 ------------ | ------------
 1            | 1
 1            | 12
 1            | 21
 2            | 6
 2            | 9
Both column_new_A and column_new_B should also be of string type.

You need to split column_B on the comma and then apply the explode function. First, a sample DataFrame:

// In spark-shell; in a standalone application you would also need
// `import spark.implicits._` for toDF to be available.
val df = Seq(
  ("1", "1,12,21"),
  ("2", "6,9")
).toDF("column_A", "column_B")
You can create the new columns with withColumn or with select.
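A sketch of the split-and-explode step the answer describes; renaming column_A via `.as` is one way to obtain the column_new_* names shown in the question:

```scala
import org.apache.spark.sql.functions.{col, explode, split}

// Split column_B on commas into an array of strings, then
// explode that array so each element becomes its own row.
val result = df
  .withColumn("column_new_B", explode(split(col("column_B"), ",")))
  .select(col("column_A").as("column_new_A"), col("column_new_B"))

result.show(false)
```

Because split produces an array of strings, both resulting columns remain string-typed, as required.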

Output:

+------------+------------+
|column_new_A|column_new_B|
+------------+------------+
|1           |1           |
|1           |12          |
|1           |21          |
|2           |6           |
|2           |9           |
+------------+------------+