Apache Spark: convert List[Map<String,String>] to a Spark DataFrame
I want to convert a List[Map] to a Spark DataFrame, where the keys of the Map are the columns of the DataFrame:
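// Map(...) builds a Map; {"a"->1} is only a block that evaluates to the tuple ("a", 1)
val map1 = Map("a" -> 1)
val map2 = Map("b" -> 2)
val lst = List(map1, map2)
// attempt: call toDF directly on the list (needs import spark.implicits._)
val lstDF = lst.toDF
lstDF.take(2).foreach(println)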
If you already have res, which is a List[Map[String,String]]:
res: List[Map[String,String]] = List(Map(A -> a1, B -> b1, C -> c1), Map(A -> a2, B -> b2, C -> c2))
You can create the DataFrame like this:
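import org.apache.spark.sql.Row
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// take the column names from the keys of the first map
val header = res.head.keys.toList
// create the schema from the header; every column is a nullable string
val schema = StructType(header.map(fieldName => StructField(fieldName, StringType, true)))
// create the rows, reading values in header order so they line up with the schema
// (m(k) throws if a map lacks one of the header keys)
val rows = res.map(m => Row(header.map(k => m(k)): _*))
// create the RDD (sc is the SparkContext, i.e. spark.sparkContext)
val rdd = sc.parallelize(rows)
// create the DataFrame from the RDD and the schema
val df = spark.createDataFrame(rdd, schema)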
You can print the result with df.show():
+---+---+---+
|  A|  B|  C|
+---+---+---+
| a1| b1| c1|
| a2| b2| c2|
+---+---+---+
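The values in each Row have to line up with the fields of the schema, and the maps in the list may not all carry the same keys (in the question, one map has "a" and the other has "b"). The following is a minimal sketch in plain Scala, with no Spark dependency, of one way to handle that: collect every key as a column and look each value up by name. The object AlignRows and its two methods are hypothetical names introduced for illustration, not part of Spark.

```scala
// Hypothetical helper (not part of Spark): align each map's values to one fixed column order.
object AlignRows {
  // Collect every key that appears in any map, sorted for a deterministic column order.
  def columns(maps: List[Map[String, String]]): List[String] =
    maps.flatMap(_.keys).distinct.sorted

  // Look each value up by column name; a missing key becomes null,
  // which a nullable StructField can hold.
  def alignRows(maps: List[Map[String, String]], cols: List[String]): List[Seq[String]] =
    maps.map(m => cols.map(c => m.getOrElse(c, null)))
}
```

With this in scope, AlignRows.columns(res) can stand in for the header, and each aligned sequence can be turned into a row with Row(values: _*).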
Note that you can also create the schema explicitly like this:
val schema = StructType(
  List(
    StructField("A", StringType, true),
    StructField("B", StringType, true),
    StructField("C", StringType, true)
  )
)
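An explicit schema can also drive how the rows are built, since it fixes both the column names and their order. The sketch below puts the pieces together under some assumptions: a Spark 2.x or later dependency on the classpath, and a SparkSession passed in by the caller. The object name ListOfMapsToDF and the method toDF are hypothetical, chosen only for this example.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical wrapper for illustration; not an existing Spark API.
object ListOfMapsToDF {
  // The explicit schema fixes the column names and their order.
  val schema: StructType = StructType(
    List(
      StructField("A", StringType, nullable = true),
      StructField("B", StringType, nullable = true),
      StructField("C", StringType, nullable = true)
    )
  )

  def toDF(spark: SparkSession, maps: List[Map[String, String]]): DataFrame = {
    // Build each Row by looking values up in schema order; a missing key becomes null.
    val rows = maps.map { m =>
      Row(schema.fieldNames.toSeq.map(name => m.getOrElse(name, null)): _*)
    }
    // Distribute the rows as an RDD and attach the schema.
    spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
  }
}

// Usage, assuming a local session (in spark-shell, `spark` already exists):
// val spark = SparkSession.builder().master("local[*]").appName("list-of-maps").getOrCreate()
// ListOfMapsToDF.toDF(spark, res).show()
```

Keys that are not listed in the schema are ignored by this sketch, so the schema acts as a whitelist of columns.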