How to rename columns inside a struct in Spark Scala

I have a DataFrame whose schema looks like this:

 |-- Col1 : string (nullable = true)
 |-- Col2 : string (nullable = true)
 |-- Col3 : struct (nullable = true)
 |    |-- 513: long (nullable = true)
 |    |-- 549: long (nullable = true)
When I run:

df.select("Col1","Col2","Col3.*").show

+-----------+--------+------+------+
|       Col1|    Col2|   513|   549|
+-----------+--------+------+------+
| AAAAAAAAA |  BBBBB |    39|    38|
+-----------+--------+------+------+
Now I want to rename the columns so that the result looks like this:

    +-----------+--------+---------+--------+
    |       Col1|    Col2| Col3=513|Col3=549|
    +-----------+--------+---------+--------+
    | AAAAAAAAA |  BBBBB |       39|      38|
    +-----------+--------+---------+--------+

The columns inside the struct are dynamic, so I can't use withColumnRenamed.

Since you asked how to rename columns inside a struct, you can do this with the schema DSL:

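import org.apache.spark.sql.types._
import spark.implicits._ // needed for $"..."; assumes the SparkSession is named `spark`

// Read the struct type of Col3, then build a copy whose field names carry the "Col3=" prefix
val schema: StructType = df.schema.fields.find(_.name == "Col3").get.dataType.asInstanceOf[StructType]
val newSchema = StructType(schema.fields.map(sf => StructField("Col3=" + sf.name, sf.dataType)))

// Casting to a struct type with the same field types but new names renames the fields
df
  .withColumn("Col3", $"Col3".cast(newSchema))
  .printSchema()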
This gives:

root
 |-- Col1: string (nullable = true)
 |-- Col2: string (nullable = true)
 |-- Col3: struct (nullable = false)
 |    |-- Col3=513: long (nullable = true)
 |    |-- Col3=549: long (nullable = true)
You can then unpack it with select($"Col3.*").


You could also unpack the struct first and then rename all the columns that have a number as their name…
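
A minimal sketch of that second approach, unpacking and renaming in one select. The names FlattenAndRename, aliasFor and flatten are hypothetical helpers made up for illustration, and the sketch assumes the other column names contain no dots or special characters:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.StructType

object FlattenAndRename {
  // Hypothetical helper: the new top-level name for one struct field.
  def aliasFor(structCol: String, field: String): String = s"$structCol=$field"

  // Replaces `structCol` with one top-level column per struct field,
  // each named "<structCol>=<field>". Field names are read from the
  // schema at runtime, so nothing about the struct is hard-coded.
  def flatten(df: DataFrame, structCol: String): DataFrame = {
    val fields = df.schema(structCol).dataType match {
      case st: StructType => st.fieldNames.toSeq
      case other =>
        throw new IllegalArgumentException(s"$structCol is not a struct: $other")
    }
    // Keep every other column as it is (assumes plain names without dots).
    val untouched = df.columns.toSeq.filter(_ != structCol).map(col)
    // Pull each field out of the struct and alias it with the prefix.
    val unpacked = fields.map(f => col(structCol).getField(f).as(aliasFor(structCol, f)))
    df.select(untouched ++ unpacked: _*)
  }
}
```

With the DataFrame from the question, FlattenAndRename.flatten(df, "Col3").show() should yield the column names the question asks for. Because the field names come from df.schema, this avoids calling withColumnRenamed once per field.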

How can this be made dynamic? What if I have two columns with the same name and want to rename them to _1 and _2?
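
One possible way to handle the duplicate-name case raised in this comment is to compute the new names positionally and pass them to toDF. This is only a sketch: DedupeNames and uniqueNames are hypothetical names, and the choice to leave names that occur once unchanged is an assumption about what the commenter wants.

```scala
import scala.collection.mutable

object DedupeNames {
  // Appends _1, _2, ... to every name that occurs more than once,
  // in order of appearance; names that occur once are left unchanged.
  def uniqueNames(names: Seq[String]): Seq[String] = {
    val totals = names.groupBy(identity).map { case (n, occurrences) => n -> occurrences.size }
    val seen = mutable.Map.empty[String, Int].withDefaultValue(0)
    names.map { n =>
      if (totals(n) > 1) {
        seen(n) += 1
        s"${n}_${seen(n)}"
      } else n
    }
  }
}

// Usage with a DataFrame (toDF renames columns by position):
//   val renamed = df.toDF(DedupeNames.uniqueNames(df.columns.toSeq): _*)
```

Because toDF assigns names by position, it does not need to resolve the ambiguous duplicate names, which a name-based rename would.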