Apache Spark SparkOnHBase throws error "had a not serializable result: org.apache.hadoop.hbase.client.Result"
Building my Spark session with the following configuration throws an error:
config("spark.serializer","org.apache.spark.serializer.JavaSerializer")
The error is:
ERROR TaskSetManager: Task 0.0 in stage 2.0 (TID 12) had a not serializable result: org.apache.hadoop.hbase.client.Result.Serialization stack:
- object not serializable (class: org.apache.hadoop.hbase.client.Result
If I change JavaSerializer to KryoSerializer it works,
but in my application it has to use JavaSerializer for service reasons.

You cannot serialize an HBase Result with the JavaSerializer. You can convert the Result into an (Array[Byte], java.util.List[(Array[Byte], Array[Byte], Array[Byte])]) pair using the code below:
import java.util
import org.apache.hadoop.hbase.{Cell, CellUtil}

// Copy each cell's family, qualifier and value into plain byte arrays,
// which (unlike Result itself) are Java-serializable.
val it = result.listCells().iterator()
val list = new util.ArrayList[(Array[Byte], Array[Byte], Array[Byte])]()
while (it.hasNext) {
  val kv: Cell = it.next()
  list.add((CellUtil.cloneFamily(kv), CellUtil.cloneQualifier(kv), CellUtil.cloneValue(kv)))
}
(result.getRow(), list)
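To see why this conversion helps, here is a minimal sketch (plain JVM only, no Spark or HBase dependencies; the row key and cell bytes are made-up sample values) showing that the resulting (row key, cell list) pair round-trips through standard Java serialization without error:

```scala
import java.io._
import java.util

object RoundTrip {
  // Serialize the (rowKey, cells) pair with plain Java serialization,
  // read it back, and return the deserialized row key as a String.
  def roundTrip(): String = {
    // Same shape the conversion above produces: raw byte arrays only.
    val list = new util.ArrayList[(Array[Byte], Array[Byte], Array[Byte])]()
    list.add(("cf".getBytes, "q1".getBytes, "v1".getBytes))
    val record = ("row1".getBytes, list)

    val buf = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(buf)
    oos.writeObject(record)  // succeeds: tuples, ArrayList and byte arrays are all Serializable
    oos.close()

    val ois = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    val copy = ois.readObject()
      .asInstanceOf[(Array[Byte], util.ArrayList[(Array[Byte], Array[Byte], Array[Byte])])]
    new String(copy._1)
  }

  def main(args: Array[String]): Unit =
    println(roundTrip())  // prints "row1"
}
```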
Why not JavaSerializer?
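The JavaSerializer is built on standard Java serialization, which requires every object it writes to implement java.io.Serializable. org.apache.hadoop.hbase.client.Result does not implement that interface, so any task result containing a Result fails with exactly this error; Kryo can serialize classes without that marker interface, which is why switching serializers makes it work. A minimal sketch reproducing the failure, where NonSerializableResult is a hypothetical stand-in for Result:

```scala
import java.io._

// Stand-in for org.apache.hadoop.hbase.client.Result:
// a class that does NOT implement java.io.Serializable.
class NonSerializableResult(val row: Array[Byte])

object WhyNotJava {
  // Returns true if the object survives Java serialization,
  // false if it throws NotSerializableException.
  def trySerialize(obj: AnyRef): Boolean =
    try {
      val oos = new ObjectOutputStream(new ByteArrayOutputStream())
      oos.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    println(trySerialize(new NonSerializableResult("r1".getBytes)))  // false
    println(trySerialize("r1".getBytes))                             // true: byte arrays are serializable
  }
}
```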