Scala Spark DataFrame does not support the Char data type

scala apache-spark

I am creating a Spark DataFrame from a text file. Suppose the employee file contains String, Int, and Char fields.

I created a case class:

case class Emp (
  Name: String, 
  eid: Int, 
  Age: Int, 
  Sex: Char, 
  Sal: Int, 
  City: String)

I create RDD1 by splitting each line, then create RDD2:

val textFileRDD2 = textFileRDD1.map(attributes => Emp(
  attributes(0), 
  attributes(1).toInt, 
  attributes(2).toInt, 
  attributes(3).charAt(0), 
  attributes(4).toInt, 
  attributes(5)))

The final RDD is:

val finalRDD = textFileRDD2.toDF

When I create the final RDD, it throws an error:

java.lang.UnsupportedOperationException: No Encoder found for scala.Char

Can anyone help me figure out the cause and how to fix it?

Spark SQL does not provide an Encoder for Char. You can use StringType instead (which means declaring Sex as String in the case class):
attributes(3).slice(0, 1)
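
For context, here is a minimal sketch of how the StringType approach might fit into the question's pipeline. The object name EmpStringExample, the helper toEmp, the local master setting, and the in-memory sample lines (standing in for the text file) are all assumptions for illustration and are not part of the original post.

```scala
import org.apache.spark.sql.SparkSession

// Same fields as the question's Emp, except Sex is a String,
// a type for which Spark SQL can derive an encoder.
case class Emp(Name: String, eid: Int, Age: Int, Sex: String, Sal: Int, City: String)

object EmpStringExample {
  // Hypothetical helper: builds an Emp from one split line.
  // slice(0, 1) keeps at most the first character, as a String.
  def toEmp(attributes: Array[String]): Emp = Emp(
    attributes(0),
    attributes(1).toInt,
    attributes(2).toInt,
    attributes(3).slice(0, 1),
    attributes(4).toInt,
    attributes(5))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for demonstration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("emp-string-sex")
      .getOrCreate()
    import spark.implicits._ // brings toDF into scope

    // Assumption: made-up sample lines in place of the employee file.
    val lines = Seq("Alice,1,30,F,5000,Paris", "Bob,2,41,M,4200,Lyon")
    val textFileRDD1 = spark.sparkContext.parallelize(lines).map(_.split(","))
    val textFileRDD2 = textFileRDD1.map(toEmp)

    val df = textFileRDD2.toDF()
    df.printSchema() // Sex should now appear as a string column
    spark.stop()
  }
}
```

Unlike charAt(0), slice(0, 1) returns an empty String rather than throwing when the field is empty, which may or may not be the behavior you want for malformed rows.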

Or ShortType (or BooleanType or ByteType, if you only accept binary responses):
attributes(3)(0) match {
   case 'F' => 1: Short
   ...
   case _ => 0: Short
}
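
As a rough sketch of the ShortType variant, the encoding can be pulled into a small helper. The names EmpCoded, SexCode, encode, and toEmp are hypothetical, and the coding itself ('F' becomes 1, everything else 0) is an assumption, since the answer above leaves the intermediate cases elided.

```scala
// Hypothetical variant of the question's Emp with Sex stored as a Short,
// which Spark SQL maps to ShortType.
case class EmpCoded(Name: String, eid: Int, Age: Int, Sex: Short, Sal: Int, City: String)

object SexCode {
  // Assumed coding: 'F' -> 1; any other value, including an empty field, -> 0.
  def encode(raw: String): Short = raw.headOption match {
    case Some('F') => 1.toShort
    case _         => 0.toShort
  }

  // Builds an EmpCoded from one split line of the employee file.
  def toEmp(attributes: Array[String]): EmpCoded = EmpCoded(
    attributes(0),
    attributes(1).toInt,
    attributes(2).toInt,
    encode(attributes(3)),
    attributes(4).toInt,
    attributes(5))
}
```

With a SparkSession's implicits imported, something like textFileRDD1.map(SexCode.toEmp).toDF() should then go through, because every field of EmpCoded has a type Spark SQL can encode.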