HBase: KeyValue size too large
I am using Spark Streaming to download web pages and insert them into HBase. I am getting the following exception:
WARN scheduler.TaskSetManager: Lost task 13.1 in stage 21.0 (TID 121,test1.server): java.lang.IllegalArgumentException: KeyValue size too large
at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1378)
at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:1364)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:974)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:941)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:126)
at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:87)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:1000)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$12.apply(PairRDDFunctions.scala:979)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
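I tried increasing hbase.client.keyvalue.maxsize, and then set hbase.client.keyvalue.maxsize to 0, which means no limit. I also increased hdfs.blocksize to 256M. But after restarting the cluster I still get the same error: KeyValue size too large.
Any ideas, please? Thanks in advance.

hbase.client.keyvalue.maxsize is a client-side property. You need to set it in hbase-site.xml on the client node. You can also set it in code, on the Configuration object. In the stack trace above the check fails in HTable.validatePut inside a Spark task, so the client here is the Spark job itself, and the setting has to reach the configuration that job uses. HBase does not need to be restarted for this property.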
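
For the hbase-site.xml route, the entry might look like the fragment below. This is only a sketch: the value 0 is the "no limit" setting from the question, and a positive number of bytes can be used instead if a cap is preferred. It assumes the file is the hbase-site.xml that ends up on the classpath of the Spark driver and executors, which depends on how the job is deployed.

```xml
<!-- Client-side hbase-site.xml (assumed to be on the Spark job's classpath) -->
<property>
  <name>hbase.client.keyvalue.maxsize</name>
  <!-- 0 disables the client-side size check; a positive value is a limit in bytes -->
  <value>0</value>
</property>
```

If the error persists after this change, the job is probably still building its configuration from a different hbase-site.xml, or from none at all.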