How to define an array type when using a Dataset in Spark Scala

I have source data like this:

{A:123,B:"Hello world",C:[{D:123,E:"Spark"}]}
This is what I want to end up with:

case class TestClass(A: Int, B: String, C: ???)
val obj: Dataset[TestClass] = df.as[TestClass]
How should I define the type of C?

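One option is to model each array element as its own case class and declare C as a Seq of that class. A and D are declared as Long rather than Int because Spark's JSON schema inference reads integer values as bigint: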

case class Nested(D: Long, E: String)
case class TestClass (A: Long, B:String, C: Seq[Nested])
Usage:

spark.read.json(sc.parallelize(
  Seq("""{"A": 123, "B": "Hello world", "C": [{"D": 123, "E": "Spark"}]}"""
))).as[TestClass].show

+---+-----------+-------------+
|  A|          B|            C|
+---+-----------+-------------+
|123|Hello world|[[123,Spark]]|
+---+-----------+-------------+
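
The same mapping can also be written as a standalone program instead of a spark-shell session. The following is only a minimal sketch, assuming Spark 2.2 or later (for `spark.read.json` on a `Dataset[String]`) and a local master. The object name `ArrayFieldExample` and the helper `load` are hypothetical names introduced for illustration. Instead of relying on inference, it derives the read schema from the case class, then flattens the array column into its elements.

```scala
import org.apache.spark.sql.{Dataset, Encoders, SparkSession}

// Case classes live at the top level so Spark can derive encoders for them.
case class Nested(D: Long, E: String)
case class TestClass(A: Long, B: String, C: Seq[Nested])

object ArrayFieldExample {

  // Hypothetical helper: reads the sample record into a typed Dataset.
  def load(spark: SparkSession): Dataset[TestClass] = {
    import spark.implicits._

    // One JSON document per element, same record as in the question.
    val jsonLines: Dataset[String] = Seq(
      """{"A": 123, "B": "Hello world", "C": [{"D": 123, "E": "Spark"}]}"""
    ).toDS()

    // Derive the schema from the case class instead of inferring it from the data.
    val schema = Encoders.product[TestClass].schema

    spark.read.schema(schema).json(jsonLines).as[TestClass]
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master is enough for this illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("array-field-example")
      .getOrCreate()
    import spark.implicits._

    val ds = load(spark)
    ds.printSchema()

    // Each element of C is a Nested value, so the array can be flattened directly.
    val nested: Dataset[Nested] = ds.flatMap(_.C)
    nested.show()

    spark.stop()
  }
}
```

Because C is typed as Seq[Nested], ordinary collection operations such as `_.C.map(_.E)` work on it inside `map` and `flatMap` without any manual Row handling.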