Hadoop: HBase aggregation

hadoop hbase

I am having trouble aggregating over a specific column in HBase.

This is the code snippet I tried:

 Configuration config = HBaseConfiguration.create();
 AggregationClient aggregationClient = new AggregationClient(config);

 Scan scan = new Scan();
 scan.addColumn(Bytes.toBytes("drs"), Bytes.toBytes("count"));

 ColumnInterpreter<Long, Long> ci = new LongColumnInterpreter();

 Long sum = aggregationClient.sum(Bytes.toBytes("DEMO_CALCULATIONS"), ci , scan);
 System.out.println(sum);

sum comes back as null. If I do a row count instead, the aggregationClient API works fine.

I am trying to follow the directions.

Could it be a problem that I am using LongColumnInterpreter when the "count" field is an int? What am I missing here?

With the default setup you can only sum long (8-byte) values.

That is because the getSum method of AggregateImplementation treats every returned KeyValue as a long:

List<KeyValue> results = new ArrayList<KeyValue>();
try {
  boolean hasMoreRows = false;
  do {
    hasMoreRows = scanner.next(results);
    for (KeyValue kv : results) {
      temp = ci.getValue(colFamily, qualifier, kv);
      if (temp != null)
        sumVal = ci.add(sumVal, ci.castToReturnType(temp));
    }
    results.clear();
  } while (hasMoreRows);
} finally {
  scanner.close();
}

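and the getValue method of LongColumnInterpreter returns null for any value that is not exactly 8 bytes long, so a 4-byte int cell is skipped and never added to the sum:

public Long getValue(byte[] colFamily, byte[] colQualifier, KeyValue kv)
    throws IOException {
  if (kv == null || kv.getValueLength() != Bytes.SIZEOF_LONG)
    return null;
  return Bytes.toLong(kv.getBuffer(), kv.getValueOffset());
}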
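
To see why the sum ends up null, here is a minimal standalone sketch of the same length check. It does not use HBase itself: the class name LongSumSketch and the helpers intToBytes, longToBytes, getValue and sum are hypothetical stand-ins, with java.nio.ByteBuffer assumed to produce the same big-endian 4-byte and 8-byte encodings that Bytes.toBytes(int) and Bytes.toBytes(long) produce.

```java
import java.nio.ByteBuffer;

public class LongSumSketch {
    // Same value as HBase's Bytes.SIZEOF_LONG (8 bytes).
    static final int SIZEOF_LONG = Long.BYTES;

    // Stand-in for Bytes.toBytes(int): a 4-byte big-endian encoding.
    static byte[] intToBytes(int v) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(v).array();
    }

    // Stand-in for Bytes.toBytes(long): an 8-byte big-endian encoding.
    static byte[] longToBytes(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Mirrors the length check in getValue: anything that is not 8 bytes yields null.
    static Long getValue(byte[] cell) {
        if (cell == null || cell.length != SIZEOF_LONG) {
            return null;
        }
        return ByteBuffer.wrap(cell).getLong();
    }

    // Mirrors the getSum loop: null values are skipped, so the sum stays null
    // when no cell passes the length check.
    static Long sum(byte[][] cells) {
        Long sumVal = null;
        for (byte[] cell : cells) {
            Long temp = getValue(cell);
            if (temp != null) {
                sumVal = (sumVal == null) ? temp : Long.valueOf(sumVal + temp);
            }
        }
        return sumVal;
    }

    public static void main(String[] args) {
        byte[][] storedAsInts = { intToBytes(3), intToBytes(4) };
        byte[][] storedAsLongs = { longToBytes(3L), longToBytes(4L) };
        System.out.println(sum(storedAsInts));  // prints "null"
        System.out.println(sum(storedAsLongs)); // prints "7"
    }
}
```

Under these assumptions, the practical fix is to write the "count" column as a long (for example by passing a long to Bytes.toBytes when the value is stored), so that each cell is 8 bytes wide and passes the check.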