How to dynamically rename columns in a Spark DataFrame based on a case class in Scala
Trying to rename existing columns based on a case class (loaded from a JSON file). For example, given this sample source DataFrame:
val empDf = Seq(
  (1, 10, "IT", "John"),
  (2, 20, "DEV", "Ed"),
  (2, 30, "OPS", "Brian")
).toDF("DEPTID", "EMPID", "DEPT_NAME", "EMPNAME")
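The answer below references `newOb`, a configuration value the question never shows. A minimal sketch of what its shape could be, assuming each entry carries the source column name, an optional new name, and an output position (the field names `colName`, `renameCol`, and `colOrder` are taken from the answer's code; the class name `ColumnMapping` and the sample values are assumptions):

```scala
// Hypothetical shape of the JSON-driven config; in practice this would be
// deserialized from the JSON file rather than written inline.
case class ColumnMapping(colName: String, renameCol: Option[String], colOrder: Int)

// One entry per source column; renameCol = None keeps the original name.
val newOb: Seq[ColumnMapping] = Seq(
  ColumnMapping("DEPTID", Some("DEPT_ID"), 1),
  ColumnMapping("DEPT_NAME", None, 2),
  ColumnMapping("EMPNAME", Some("EMP_NAME"), 3),
  ColumnMapping("EMPID", Some("EMP_ID"), 4)
)
```

Using `Option[String]` for `renameCol` lets the config omit a new name for columns that should pass through unchanged, which is exactly what `getOrElse(om.colName)` in the answer relies on.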
Based on the above configuration, you want to rename the DataFrame's columns: EMPID becomes EMP_ID, EMPNAME becomes EMP_NAME, and DEPTID becomes DEPT_ID. You can do it like this:
import org.apache.spark.sql.Column
import org.apache.spark.sql.functions._

// Build the select expressions in the configured order,
// aliasing each column to its new name (or keeping the old one).
val selectExpr: Seq[Column] = newOb
  .sortBy(_.colOrder)
  .map(om => col(om.colName).as(om.renameCol.getOrElse(om.colName)))

empDf
  .select(selectExpr: _*)
  .show()
which gives:
+-------+---------+--------+------+
|DEPT_ID|DEPT_NAME|EMP_NAME|EMP_ID|
+-------+---------+--------+------+
| 1| IT| John| 10|
| 2| DEV| Ed| 20|
| 2| OPS| Brian| 30|
+-------+---------+--------+------+