
Spark: transpose DataFrame columns to rows without using RDDs


I have a JSON file in the following format:

{"sku-1":{"att-a":"att-a-7","att-b":"att-b-3","att-c":"att-c-10","att-d":"att-d-10","att-e":"att-e-15","att-f":"att-f-11","att-g":"att-g-2","att-h":"att-h-7","att-i":"att-i-5","att-j":"att-j-14"},"sku-2":{"att-a":"att-a-9","att-b":"att-b-7","att-c":"att-c-12","att-d":"att-d-4","att-e":"att-e-10","att-f":"att-f-4","att-g":"att-g-13","att-h":"att-h-4","att-i":"att-i-1","att-j":"att-j-13"},"sku-3":{"att-a":"att-a-10","att-b":"att-b-6","att-c":"att-c-1","att-d":"att-d-1","att-e":"att-e-13","att-f":"att-f-12","att-g":"att-g-9","att-h":"att-h-6","att-i":"att-i-7","att-j":"att-j-4"}}

I need to read it into a Spark DataFrame with a new structure: one row per SKU, with the attributes as columns.

I tried reading it as follows:

    import org.apache.spark.sql.types.{StringType, StructType}

    // A single top-level struct field named "SKUNAME" holding the ten attributes
    val schema = (new StructType)
      .add("SKUNAME", (new StructType)
        .add("att-a", StringType)
        .add("att-b", StringType)
        .add("att-c", StringType)
        .add("att-d", StringType)
        .add("att-e", StringType)
        .add("att-f", StringType)
        .add("att-g", StringType)
        .add("att-h", StringType)
        .add("att-i", StringType)
        .add("att-j", StringType))

    val recommendationInputDf = sparkSession.read.schema(schema).json(recommendationsPath)

Below is the output of the code above.

Schema

root
 |-- SKUNAME: struct (nullable = true)
 |    |-- att-a: string (nullable = true)
 |    |-- att-b: string (nullable = true)
 |    |-- att-c: string (nullable = true)
 |    |-- att-d: string (nullable = true)
 |    |-- att-e: string (nullable = true)
 |    |-- att-f: string (nullable = true)
 |    |-- att-g: string (nullable = true)
 |    |-- att-h: string (nullable = true)
 |    |-- att-i: string (nullable = true)
 |    |-- att-j: string (nullable = true)
Data
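
A likely reason this read is not useful: Spark's JSON reader matches schema fields to JSON keys by name, and the file has top-level keys `sku-1`, `sku-2`, `sku-3` rather than `SKUNAME`. The sketch below illustrates that on a shortened, made-up sample (two SKUs, one attribute). The local session, the `sample` dataset and the variable names are assumptions for illustration, not part of the original code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{StringType, StructType}

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("schema-mismatch-sketch")
  .getOrCreate()
import spark.implicits._

// Shortened stand-in for the real file: same shape, fewer keys.
val sample = Seq(
  """{"sku-1":{"att-a":"att-a-7"},"sku-2":{"att-a":"att-a-9"}}"""
).toDS()

// Schema in the style of the question: one field that matches no JSON key.
val mismatched = new StructType()
  .add("SKUNAME", new StructType().add("att-a", StringType))

// No key is called "SKUNAME", so the column is expected to be null.
val withSchema = spark.read.schema(mismatched).json(sample)
withSchema.show(false)

// Letting Spark infer the schema yields one struct column per SKU key.
val inferred = spark.read.json(sample)
inferred.printSchema()
```

With the inferred schema each SKU becomes its own struct column, which is the shape the column-to-row transposition later on this page starts from.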

I also checked other similar questions, but could not get the desired output.

Following the comments, I tried the suggested solution below:

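import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{array, col, explode, lit, struct}

// Melt every column not listed in `by` into (key, val) rows.
def toLong(df: DataFrame, by: Seq[String]): DataFrame = {
  val (cols, types) = df.dtypes.filter { case (c, _) => !by.contains(c) }.unzip
  // The melted columns must share one type so they fit into a single array.
  require(types.distinct.size == 1)

  // Build one struct(key, val) per melted column and explode it into rows.
  val kvs = explode(array(
    cols.map(c => struct(lit(c).alias("key"), col(c).alias("val"))): _*))

  val byExprs = by.map(col(_))
  import sparkSession.sqlContext.implicits._
  df
    .select(byExprs :+ kvs.alias("_kvs"): _*)
    .select(byExprs ++ Seq($"_kvs.key", $"_kvs.val"): _*)
}

toLong(recommendationInputDf, Seq("sku-1")).show(12, false)

But the output is as follows: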

+--------------------------------------------------------------------------------------+-----+-------------------------------------------------------------------------------------+
|sku-1                                                                                 |key  |val                                                                                  |
+--------------------------------------------------------------------------------------+-----+-------------------------------------------------------------------------------------+
|[att-a-7,att-b-3,att-c-10,att-d-10,att-e-15,att-f-11,att-g-2,att-h-7,att-i-5,att-j-14]|sku-2|[att-a-9,att-b-7,att-c-12,att-d-4,att-e-10,att-f-4,att-g-13,att-h-4,att-i-1,att-j-13]|
|[att-a-7,att-b-3,att-c-10,att-d-10,att-e-15,att-f-11,att-g-2,att-h-7,att-i-5,att-j-14]|sku-3|[att-a-10,att-b-6,att-c-1,att-d-1,att-e-13,att-f-12,att-g-9,att-h-6,att-i-7,att-j-4] |
+--------------------------------------------------------------------------------------+-----+-------------------------------------------------------------------------------------+

Following the instructions in zero323's answer:

import spark.implicits._  // needed for createDataset and the $"..." column syntax

val df = spark.read.json(spark.createDataset(Seq(
  """{"sku-1":{"att-a":"att-a-7","att-b":"att-b-3","att-c":"att-c-10","att-d":"att-d-10","att-e":"att-e-15","att-f":"att-f-11","att-g":"att-g-2","att-h":"att-h-7","att-i":"att-i-5","att-j":"att-j-14"},"sku-2":{"att-a":"att-a-9","att-b":"att-b-7","att-c":"att-c-12","att-d":"att-d-4","att-e":"att-e-10","att-f":"att-f-4","att-g":"att-g-13","att-h":"att-h-4","att-i":"att-i-1","att-j":"att-j-13"},"sku-3":{"att-a":"att-a-10","att-b":"att-b-6","att-c":"att-c-1","att-d":"att-d-1","att-e":"att-e-13","att-f":"att-f-12","att-g":"att-g-9","att-h":"att-h-6","att-i":"att-i-7","att-j":"att-j-4"}}"""
)))

toLong(df, Seq()).select($"key".alias("sku"), $"val.*").show
+-----+--------+-------+--------+--------+--------+--------+--------+-------+-------+--------+
|  sku|   att-a|  att-b|   att-c|   att-d|   att-e|   att-f|   att-g|  att-h|  att-i|   att-j|
+-----+--------+-------+--------+--------+--------+--------+--------+-------+-------+--------+
|sku-1| att-a-7|att-b-3|att-c-10|att-d-10|att-e-15|att-f-11| att-g-2|att-h-7|att-i-5|att-j-14|
|sku-2| att-a-9|att-b-7|att-c-12| att-d-4|att-e-10| att-f-4|att-g-13|att-h-4|att-i-1|att-j-13|
|sku-3|att-a-10|att-b-6| att-c-1| att-d-1|att-e-13|att-f-12| att-g-9|att-h-6|att-i-7| att-j-4|
+-----+--------+-------+--------+--------+--------+--------+--------+-------+-------+--------+
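
As an alternative to `toLong`, and not something taken from the answer above, the same reshaping can be sketched by parsing the document as a map keyed by SKU and exploding that map. This assumes a Spark version in which `from_json` accepts a `MapType` schema (2.4 or later, to my knowledge). The sample is shortened to three attributes, and the names `raw`, `attrs`, `bySku` and `wide` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{MapType, StringType, StructField, StructType}

// Local session for illustration only; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("transpose-map-sketch")
  .getOrCreate()
import spark.implicits._

// Shortened stand-in for the real document, held as one string per row.
val raw = Seq(
  """{"sku-1":{"att-a":"att-a-7","att-b":"att-b-3","att-c":"att-c-10"},""" +
  """"sku-2":{"att-a":"att-a-9","att-b":"att-b-7","att-c":"att-c-12"},""" +
  """"sku-3":{"att-a":"att-a-10","att-b":"att-b-6","att-c":"att-c-1"}}"""
).toDF("value")

// Attribute struct; extend the range to 'j' for the full data set.
val attrs = StructType(('a' to 'c').map(c => StructField(s"att-$c", StringType)))

// Parse the whole document as map<sku name, attribute struct>.
val parsed = from_json(col("value"), MapType(StringType, attrs))

// Exploding a map yields one row per entry, with a key and a value column.
val bySku = raw.select(explode(parsed).as(Seq("sku", "attrs")))

// Flatten the struct so each attribute becomes its own column.
val wide = bySku.select(col("sku"), col("attrs.*"))
wide.show(false)
```

The SKU names never appear in the schema here, so new SKUs in the file need no code change; the trade-off is that the attribute names have to be known up front.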