Apache Spark: How to add aliases to columns inside a struct in Spark SQL

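The following query

sparkSession.sql("select struct(col1,col2) as myStruct from table1")

returns a DataFrame with this schema:

root
 |-- myStruct : struct (nullable = false)
 |    |-- col1: string (nullable = true)
 |    |-- col2: string (nullable = true)

But I need col1 to appear as myCol1 and col2 as myCol2 inside the struct.

When I use the as keyword inside the struct function, it fails:

sparkSession.sql("select struct(col1 as myCol1,col2 as myCol2) as myStruct from table1")

gives the following error message:

mismatched input 'as' expecting {')', ','}(line 1, pos 19)

How can I get column aliases on the struct fields?

You can try this on the already created DataFrame in Spark 2.1.0:

import org.apache.spark.sql.functions.struct
import spark.implicits._  // spark is the SparkSession; needed for the $"..." column syntax

// Rebuild the struct with aliased fields. withColumn replaces an existing column
// of the same name, so no temporary column, drop, or rename is needed.
val newDF = oldDF.withColumn("myCol", struct($"myCol.col1".alias("myCol1"), $"myCol.col2".alias("myCol2")))

Which version of Spark are you using? I just tried it on 2.3 and it works fine:

scala> sql("select struct(user as l, movie as p) as mystruct from table").printSchema
root
 |-- mystruct: struct (nullable = false)
 |    |-- l: integer (nullable = true)
 |    |-- p: integer (nullable = true)

Spark 2.1.0. That does not work either:

extraneous input '$' expecting {'SELECT', 'FROM', 'ADD', 'AS', 'ALL', 'DISTINCT', 'WHERE', 'GROUP', 'BY', 'GROUPING', 'SETS', 'CUBE', 'ROLLUP', 'ORDER', ...}

You have to import spark.implicits._ before you can use $ to refer to columns.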
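
For Spark versions where the SQL parser rejects as inside struct(...), two other routes may be worth trying. The sketch below is an illustration, not a tested fix for 2.1.0 specifically: it assumes a local SparkSession and a hypothetical one-row stand-in for table1. It shows the built-in named_struct SQL function, which takes alternating field names and values, and the DataFrame API, where each column is aliased before being passed to struct.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct
import org.apache.spark.sql.types.StructType

// Local session for illustration only; in spark-shell a session already exists as `spark`.
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("struct-alias-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical stand-in for table1 from the question.
Seq(("a", "b")).toDF("col1", "col2").createOrReplaceTempView("table1")

// Option 1 (SQL): named_struct takes alternating field names and values.
val viaSql = spark.sql(
  "select named_struct('myCol1', col1, 'myCol2', col2) as myStruct from table1")

// Option 2 (DataFrame API): alias each column before passing it to struct().
val viaApi = spark.table("table1")
  .select(struct($"col1".as("myCol1"), $"col2".as("myCol2")).as("myStruct"))

// Helper (name is illustrative): field names of the myStruct column in a schema.
def structFields(schema: StructType): Seq[String] =
  schema("myStruct").dataType.asInstanceOf[StructType].fieldNames.toSeq

val sqlFields = structFields(viaSql.schema)
val apiFields = structFields(viaApi.schema)

viaSql.printSchema()
viaApi.printSchema()
```

Both variants build the struct in a single projection, so the original col1 and col2 columns do not need to be dropped or renamed afterwards.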