Scala Spark - Can't create a schema for an array of structs
I'm trying to create a fairly simple schema for a DataFrame that contains an array of structs, but I just can't get it to work. I've read several similar questions here, and it still doesn't work. I've been through many iterations. Here is my current attempt:
val theSchema = StructType (
  StructField("dateTime",StringType,true),
  StructField("sys",StringType,true),
  StructField("attribs",ArrayType(StructType(StructField("attribName",StringType,true), StructField("attribValue",StringType,true)),true),true)
)
It fails with this error:
<console>:29: error: overloaded method value apply with alternatives:
(fields: Array[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType <and>
(fields: java.util.List[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType <and>
(fields: Seq[org.apache.spark.sql.types.StructField])org.apache.spark.sql.types.StructType
cannot be applied to (org.apache.spark.sql.types.StructField, org.apache.spark.sql.types.StructField)
StructField("attribs",ArrayType(StructType(StructField("attribName",StringType,true), StructField("attribValue",StringType,true)),true),true)
^
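What am I doing wrong?

Look at the signature of StructType:
StructType(fields: Array[StructField]) extends DataType with Seq[StructField] with Product with Serializable
It takes a single collection of StructFields, not the fields as separate arguments, which is why none of the apply overloads listed in the error matches (StructField, StructField). As stated in the API docs, a StructType object can be constructed with StructType(fields: Seq[StructField]), so wrap the fields in a Seq, both in the outer schema and in the nested struct: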
import org.apache.spark.sql.types._

// Each StructType receives one Seq[StructField] instead of separate arguments.
val theSchema = StructType(Seq(
  StructField("dateTime", StringType, true),
  StructField("sys", StringType, true),
  // Array whose elements are structs with two string fields.
  StructField("attribs", ArrayType(StructType(Seq(
    StructField("attribName", StringType, true),
    StructField("attribValue", StringType, true)
  )), true), true)
))
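
If the nested Seq wrappers are hard to read, the same schema can also be built step by step with StructType's add method, which avoids the apply overloads entirely. This is only a sketch meant for spark-shell, like the code above; the names attribStruct and builderSchema are made up for illustration and do not appear in the question.

```scala
import org.apache.spark.sql.types._

// Struct describing one element of the "attribs" array
// (attribStruct is a hypothetical name used only in this sketch).
val attribStruct = new StructType()
  .add("attribName", StringType, nullable = true)
  .add("attribValue", StringType, nullable = true)

// Same three top-level fields as theSchema, added one at a time.
// Each add call returns a new StructType with the field appended.
val builderSchema = new StructType()
  .add("dateTime", StringType, nullable = true)
  .add("sys", StringType, nullable = true)
  .add("attribs", ArrayType(attribStruct, containsNull = true), nullable = true)

// Print the schema as a tree to check the nesting by eye.
builderSchema.printTreeString()
```

Naming the inner struct separately keeps the array element type readable and lets it be reused if another column needs the same shape.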