Apache Spark: SparkOnHBase throws "had a not serializable result: org.apache.hadoop.hbase.client.Result"


Building the Spark session like this throws an error:

    config("spark.serializer", "org.apache.spark.serializer.JavaSerializer")

The error is:

    ERROR TaskSetManager: Task 0.0 in stage 2.0 (TID 12) had a not serializable result: org.apache.hadoop.hbase.client.Result
    Serialization stack:
        - object not serializable (class: org.apache.hadoop.hbase.client.Result
If I change JavaSerializer to KryoSerializer, it works.

But my application has to use the JavaSerializer because of other services it depends on.
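
For context, a minimal sketch of how such a session might be built; the app name and master below are illustrative, not from the original post:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the failing setup. With the JavaSerializer, any
// task result that is not java.io.Serializable (such as HBase's
// Result) will fail to ship back to the driver.
val spark = SparkSession.builder()
  .appName("SparkOnHBase")   // illustrative
  .master("local[*]")        // illustrative
  .config("spark.serializer", "org.apache.spark.serializer.JavaSerializer")
  .getOrCreate()
```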

You cannot serialize an HBase Result with the JavaSerializer. You can convert the Result into a serializable (Array[Byte], java.util.List[(Array[Byte], Array[Byte], Array[Byte])]) pair, i.e. the row key plus a list of (family, qualifier, value) byte arrays, using the code below:

    import java.util
    import org.apache.hadoop.hbase.{Cell, CellUtil}
    import org.apache.hadoop.hbase.client.Result

    def toSerializable(result: Result): (Array[Byte], util.ArrayList[(Array[Byte], Array[Byte], Array[Byte])]) = {
      val it = result.listCells().iterator()
      val list = new util.ArrayList[(Array[Byte], Array[Byte], Array[Byte])]()
      while (it.hasNext) {
        val kv: Cell = it.next()
        // CellUtil.clone* copies the bytes out of the backing Cell, so the
        // returned tuple no longer references the non-serializable Result
        list.add((CellUtil.cloneFamily(kv), CellUtil.cloneQualifier(kv), CellUtil.cloneValue(kv)))
      }
      (result.getRow(), list)
    }
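
As a hedged sketch, this is how the conversion would typically be applied to an HBase scan RDD. The table name is illustrative, and toSerializable here stands for the conversion snippet above wrapped in a helper:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Result
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

// Sketch, assuming a running HBase cluster and an existing table.
val conf = HBaseConfiguration.create()
conf.set(TableInputFormat.INPUT_TABLE, "mytable") // illustrative name

val hbaseRdd = spark.sparkContext.newAPIHadoopRDD(
  conf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

// Convert each Result to plain byte arrays on the executors, before
// any shuffle or collect, so the JavaSerializer never sees Result.
val rows = hbaseRdd.map { case (_, result) => toSerializable(result) }
rows.collect()
```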

Why not the JavaSerializer? Java serialization only handles classes that implement java.io.Serializable, and org.apache.hadoop.hbase.client.Result does not implement that interface. Kryo does not require the marker interface, which is why switching to the KryoSerializer makes the error go away.
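
The failure can be reproduced without HBase at all; this small sketch uses a stand-in class (LikeResult is hypothetical) to show that plain Java serialization rejects any class lacking the Serializable marker:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object SerializerDemo {
  // Stand-in for org.apache.hadoop.hbase.client.Result: any class that
  // does not implement java.io.Serializable fails in exactly this way.
  class LikeResult

  def main(args: Array[String]): Unit = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(new LikeResult)
    } catch {
      case e: NotSerializableException =>
        println("not serializable: " + e.getMessage)
    }
  }
}
```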