Making Scala test cases for an array of columns

Tags: scala, apache-spark

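I want to create a sequence of rows for the schema below, to use in generating test cases, and would appreciate suggestions on how to do so. I have already made an attempt at this.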

 |-- column1: integer (nullable = true)
 |-- column2: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- column21: string (nullable = true)
 |    |    |-- column22: string (nullable = true)
 |    |    |-- column23: integer (nullable = true)

Can you answer this question? I am new to Scala, so please help.
Basically, the above implementation is wrong.

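  import org.apache.spark.sql.types.{ArrayType, IntegerType, StringType, StructType}
  // Assumes an existing SparkSession named `spark`, e.g. in spark-shell.
  import spark.implicits._

  // Build the data from tuples; nested tuples become structs with default field names (_1, _2, _3).
  val df = Seq(
    (1, Seq(("a", "b", 1))),
    (2, Seq(("c", "d", 2)))
  ).toDF()

  // Target schema carrying the desired column and field names.
  val schema = new StructType()
    .add("column1", IntegerType)
    .add("column2", ArrayType(new StructType()
      .add("column2_1", StringType)
      .add("column2_2", StringType)
      .add("column2_3", IntegerType)
    ))

  // Re-apply the schema to the underlying rows so that the names take effect.
  val df2 = spark.createDataFrame(df.rdd, schema)
  df2.printSchema()
  df2.show()

Output:
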
root
 |-- column1: integer (nullable = true)
 |-- column2: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- column2_1: string (nullable = true)
 |    |    |-- column2_2: string (nullable = true)
 |    |    |-- column2_3: integer (nullable = true)

+-------+-----------+
|column1|    column2|
+-------+-----------+
|      1|[[a, b, 1]]|
|      2|[[c, d, 2]]|
+-------+-----------+
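
Since the question asks for a sequence of rows, here is a minimal sketch of an alternative that builds `Row` objects directly instead of going through tuples. It is not part of the original answer. The object name `RowSeqExample`, the local `SparkSession` settings, and the sample values are assumptions made for illustration; the field names `column21` to `column23` are taken from the schema in the question.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{ArrayType, IntegerType, StringType, StructType}

// Hypothetical wrapper object, used only for this illustration.
object RowSeqExample {
  // Struct type of each array element, using the field names from the question.
  val elementType: StructType = new StructType()
    .add("column21", StringType)
    .add("column22", StringType)
    .add("column23", IntegerType)

  // Full schema: an integer column plus an array-of-struct column.
  val schema: StructType = new StructType()
    .add("column1", IntegerType)
    .add("column2", ArrayType(elementType))

  // Each struct inside the array is itself a Row; the sample values are made up.
  val rows: Seq[Row] = Seq(
    Row(1, Seq(Row("a", "b", 1))),
    Row(2, Seq(Row("c", "d", 2), Row("e", "f", 3)))
  )

  def main(args: Array[String]): Unit = {
    // Assumption: a local single-threaded session is sufficient for a test.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("row-seq-example")
      .getOrCreate()

    // Pair the rows with the explicit schema.
    val df = spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
    df.printSchema()
    df.show(false)

    // Simple checks of the kind a test case might make.
    assert(df.schema == schema)
    assert(df.count() == 2)

    spark.stop()
  }
}
```

Because the rows are paired with the schema at creation time, the field names do not have to be re-applied afterwards. The trade-off is that `Row` is untyped, so a value that does not match the schema is only reported at runtime.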